Overview

Dataset statistics

Number of variables: 22
Number of observations: 45366
Missing cells: 94468
Missing cells (%): 9.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 MiB
Average record size in memory: 176.0 B
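
The derived figures in this table follow from the raw counts. The sketch below recomputes them with plain arithmetic; the variable names are illustrative, and the record-size result is only approximate because the report rounds the total size to 7.6 MiB.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 22
n_observations = 45366
missing_cells = 94468
total_size_mib = 7.6  # rounded in the report

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```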

Variable types

Categorical: 12
Numeric: 9
DateTime: 1

Alerts

belongs_to_collection has high cardinality: 1695 distinct values [High cardinality]
genres has high cardinality: 4064 distinct values [High cardinality]
original_language has high cardinality: 89 distinct values [High cardinality]
overview has high cardinality: 44231 distinct values [High cardinality]
production_companies has high cardinality: 22666 distinct values [High cardinality]
production_countries has high cardinality: 2388 distinct values [High cardinality]
spoken_languages has high cardinality: 1841 distinct values [High cardinality]
tagline has high cardinality: 20269 distinct values [High cardinality]
title has high cardinality: 42195 distinct values [High cardinality]
cast has high cardinality: 42656 distinct values [High cardinality]
crew has high cardinality: 42943 distinct values [High cardinality]
budget is highly correlated overall with revenue and 1 other field [High correlation]
popularity is highly correlated overall with vote_count [High correlation]
revenue is highly correlated overall with budget and 2 other fields [High correlation]
vote_count is highly correlated overall with popularity and 1 other field [High correlation]
return is highly correlated overall with budget and 1 other field [High correlation]
original_language is highly imbalanced (67.4%) [Imbalance]
production_countries is highly imbalanced (57.7%) [Imbalance]
spoken_languages is highly imbalanced (62.0%) [Imbalance]
status is highly imbalanced (97.0%) [Imbalance]
belongs_to_collection has 40878 (90.1%) missing values [Missing]
genres has 2383 (5.3%) missing values [Missing]
overview has 941 (2.1%) missing values [Missing]
production_companies has 11792 (26.0%) missing values [Missing]
production_countries has 6208 (13.7%) missing values [Missing]
spoken_languages has 3888 (8.6%) missing values [Missing]
tagline has 24970 (55.0%) missing values [Missing]
cast has 2348 (5.2%) missing values [Missing]
crew has 723 (1.6%) missing values [Missing]
popularity is highly skewed (γ1 = 29.21456901) [Skewed]
return is highly skewed (γ1 = 138.3142814) [Skewed]
overview is uniformly distributed [Uniform]
tagline is uniformly distributed [Uniform]
title is uniformly distributed [Uniform]
cast is uniformly distributed [Uniform]
crew is uniformly distributed [Uniform]
budget has 36477 (80.4%) zeros [Zeros]
popularity has 1428 (3.1%) zeros [Zeros]
revenue has 37958 (83.7%) zeros [Zeros]
runtime has 1534 (3.4%) zeros [Zeros]
vote_average has 2947 (6.5%) zeros [Zeros]
vote_count has 2849 (6.3%) zeros [Zeros]
return has 40043 (88.3%) zeros [Zeros]
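
The γ1 values in the skewness alerts are skewness coefficients. As a rough illustration, the sketch below computes the population moment coefficient g1 = m3 / m2^1.5. The report does not say which estimator it uses, and pandas' `Series.skew` applies a bias correction, so results on real data may differ slightly. The function name and sample lists are hypothetical.

```python
from statistics import fmean


def skewness(values):
    """Population moment coefficient of skewness, g1 = m3 / m2**1.5."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical samples, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
long_tail = [0] * 95 + [1, 2, 5, 50, 500]  # mostly zeros, a few large values

print(skewness(symmetric))  # symmetric data gives a value at or very near zero
print(skewness(long_tail))  # a long right tail gives a large positive value
```

A column dominated by zeros with a handful of very large values, as popularity and return appear to be, produces exactly this kind of large positive coefficient.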

Reproduction

Analysis started: 2023-06-09 23:19:48.070327
Analysis finished: 2023-06-09 23:20:23.442462
Duration: 35.37 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
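
The reported duration can be checked against the two timestamps. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2023-06-09 23:19:48.070327")
finished = datetime.fromisoformat("2023-06-09 23:20:23.442462")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```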

Variables

belongs_to_collection
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 1695
Distinct (%): 37.8%
Missing: 40878
Missing (%): 90.1%
Memory size: 354.5 KiB

Value | Count
The Bowery Boys | 29
Totò Collection | 27
James Bond Collection | 26
Zatôichi: The Blind Swordsman | 26
The Carry On Collection | 25
Other values (1690) | 4355

Length

Max length: 54
Median length: 43
Mean length: 23.855838
Min length: 3

Characters and Unicode

Total characters: 107065
Distinct characters: 166
Distinct categories: 12
Distinct scripts: 7
Distinct blocks: 8

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
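
To make the category counts concrete, the sketch below tallies Unicode general categories for one value from this column using Python's standard `unicodedata` module. The helper name `category_counts` is illustrative. The module returns two-letter codes (`Ll` = Lowercase Letter, `Lu` = Uppercase Letter, `Zs` = Space Separator, `No` = Other Number) and does not expose script or block lookups, so those two breakdowns are not reproduced here.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("James Bond Collection")
for code, count in counts.most_common():
    print(code, count)

# Accented letters are still lowercase letters; the fraction sign is a number.
print(unicodedata.category("ô"))
print(unicodedata.category("½"))
```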

Unique

Unique: 390
Unique (%): 8.7%

Sample

1st row: Toy Story Collection
2nd row: Grumpy Old Men Collection
3rd row: Father of the Bride Collection
4th row: James Bond Collection
5th row: Balto Collection

Common Values

Value | Count | Frequency (%)
The Bowery Boys | 29 | 0.1%
Totò Collection | 27 | 0.1%
James Bond Collection | 26 | 0.1%
Zatôichi: The Blind Swordsman | 26 | 0.1%
The Carry On Collection | 25 | 0.1%
Pokémon Collection | 22 | < 0.1%
Charlie Chan (Sidney Toler) Collection | 21 | < 0.1%
Godzilla (Showa) Collection | 16 | < 0.1%
Dragon Ball Z (Movie) Collection | 15 | < 0.1%
Uuno Turhapuro | 15 | < 0.1%
Other values (1685) | 4266 | 9.4%
(Missing) | 40878 | 90.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
collection 3743
25.3%
the 1146
 
7.8%
of 230
 
1.6%
series 147
 
1.0%
139
 
0.9%
trilogy 87
 
0.6%
and 84
 
0.6%
man 62
 
0.4%
a 62
 
0.4%
in 56
 
0.4%
Other values (2407) 9028
61.1%

Most occurring characters

ValueCountFrequency (%)
o 11114
 
10.4%
e 10450
 
9.8%
10297
 
9.6%
l 10200
 
9.5%
i 7559
 
7.1%
n 7403
 
6.9%
t 6488
 
6.1%
c 4845
 
4.5%
C 4474
 
4.2%
a 4459
 
4.2%
Other values (156) 29776
27.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 81103
75.8%
Uppercase Letter 13885
 
13.0%
Space Separator 10297
 
9.6%
Other Punctuation 576
 
0.5%
Open Punctuation 335
 
0.3%
Close Punctuation 335
 
0.3%
Decimal Number 321
 
0.3%
Dash Punctuation 162
 
0.2%
Other Letter 37
 
< 0.1%
Final Punctuation 9
 
< 0.1%
Other values (2) 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 11114
13.7%
e 10450
12.9%
l 10200
12.6%
i 7559
9.3%
n 7403
9.1%
t 6488
8.0%
c 4845
 
6.0%
a 4459
 
5.5%
r 3870
 
4.8%
s 2588
 
3.2%
Other values (69) 12127
15.0%
Uppercase Letter
ValueCountFrequency (%)
C 4474
32.2%
T 1527
 
11.0%
S 1063
 
7.7%
B 682
 
4.9%
M 630
 
4.5%
A 509
 
3.7%
D 505
 
3.6%
H 462
 
3.3%
P 432
 
3.1%
G 417
 
3.0%
Other values (33) 3184
22.9%
Other Letter
ValueCountFrequency (%)
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
3
 
8.1%
2
 
5.4%
Other values (4) 8
21.6%
Other Punctuation
ValueCountFrequency (%)
. 172
29.9%
' 107
18.6%
: 99
17.2%
, 79
13.7%
& 52
 
9.0%
! 35
 
6.1%
/ 21
 
3.6%
? 4
 
0.7%
* 4
 
0.7%
3
 
0.5%
Decimal Number
ValueCountFrequency (%)
1 80
24.9%
9 64
19.9%
3 54
16.8%
0 51
15.9%
2 21
 
6.5%
8 13
 
4.0%
5 12
 
3.7%
7 11
 
3.4%
6 10
 
3.1%
4 5
 
1.6%
Open Punctuation
ValueCountFrequency (%)
( 330
98.5%
[ 5
 
1.5%
Close Punctuation
ValueCountFrequency (%)
) 330
98.5%
] 5
 
1.5%
Dash Punctuation
ValueCountFrequency (%)
- 160
98.8%
2
 
1.2%
Space Separator
ValueCountFrequency (%)
10297
100.0%
Final Punctuation
ValueCountFrequency (%)
9
100.0%
Modifier Letter
ValueCountFrequency (%)
3
100.0%
Other Number
ValueCountFrequency (%)
½ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 94574
88.3%
Common 12040
 
11.2%
Cyrillic 414
 
0.4%
Hiragana 15
 
< 0.1%
Hangul 10
 
< 0.1%
Katakana 9
 
< 0.1%
Han 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 11114
11.8%
e 10450
11.0%
l 10200
10.8%
i 7559
 
8.0%
n 7403
 
7.8%
t 6488
 
6.9%
c 4845
 
5.1%
C 4474
 
4.7%
a 4459
 
4.7%
r 3870
 
4.1%
Other values (70) 23712
25.1%
Cyrillic
ValueCountFrequency (%)
л 48
 
11.6%
и 41
 
9.9%
о 37
 
8.9%
к 30
 
7.2%
е 27
 
6.5%
я 25
 
6.0%
а 17
 
4.1%
ц 16
 
3.9%
К 16
 
3.9%
р 14
 
3.4%
Other values (32) 143
34.5%
Common
ValueCountFrequency (%)
10297
85.5%
( 330
 
2.7%
) 330
 
2.7%
. 172
 
1.4%
- 160
 
1.3%
' 107
 
0.9%
: 99
 
0.8%
1 80
 
0.7%
, 79
 
0.7%
9 64
 
0.5%
Other values (20) 322
 
2.7%
Hiragana
ValueCountFrequency (%)
3
20.0%
3
20.0%
3
20.0%
3
20.0%
3
20.0%
Hangul
ValueCountFrequency (%)
2
20.0%
2
20.0%
2
20.0%
2
20.0%
2
20.0%
Katakana
ValueCountFrequency (%)
3
33.3%
3
33.3%
3
33.3%
Han
ValueCountFrequency (%)
3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 106351
99.3%
Cyrillic 414
 
0.4%
None 246
 
0.2%
Hiragana 15
 
< 0.1%
Punctuation 14
 
< 0.1%
Katakana 12
 
< 0.1%
Hangul 10
 
< 0.1%
CJK 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 11114
 
10.5%
e 10450
 
9.8%
10297
 
9.7%
l 10200
 
9.6%
i 7559
 
7.1%
n 7403
 
7.0%
t 6488
 
6.1%
c 4845
 
4.6%
C 4474
 
4.2%
a 4459
 
4.2%
Other values (67) 29062
27.3%
Cyrillic
ValueCountFrequency (%)
л 48
 
11.6%
и 41
 
9.9%
о 37
 
8.9%
к 30
 
7.2%
е 27
 
6.5%
я 25
 
6.0%
а 17
 
4.1%
ц 16
 
3.9%
К 16
 
3.9%
р 14
 
3.4%
Other values (32) 143
34.5%
None
ValueCountFrequency (%)
é 45
18.3%
ä 40
16.3%
ô 35
14.2%
ò 28
11.4%
ö 19
7.7%
ó 14
 
5.7%
ı 14
 
5.7%
í 9
 
3.7%
á 4
 
1.6%
İ 4
 
1.6%
Other values (19) 34
13.8%
Punctuation
ValueCountFrequency (%)
9
64.3%
3
 
21.4%
2
 
14.3%
Hiragana
ValueCountFrequency (%)
3
20.0%
3
20.0%
3
20.0%
3
20.0%
3
20.0%
Katakana
ValueCountFrequency (%)
3
25.0%
3
25.0%
3
25.0%
3
25.0%
CJK
ValueCountFrequency (%)
3
100.0%
Hangul
ValueCountFrequency (%)
2
20.0%
2
20.0%
2
20.0%
2
20.0%
2
20.0%

budget
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1223
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4235014.7
Minimum: 0
Maximum: 3.8 × 10^8
Zeros: 36477
Zeros (%): 80.4%
Negative: 0
Negative (%): 0.0%
Memory size: 354.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 25000000
Maximum: 3.8 × 10^8
Range: 3.8 × 10^8
Interquartile range (IQR): 0
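
The zero quartiles follow from the share of zeros: when more than 75% of the values are 0, Q1, the median and Q3 all land on 0, so the IQR is 0 as well. The sketch below shows this on a hypothetical sample with 80% zeros. It uses `statistics.quantiles` with the inclusive method, which may not match the interpolation the report used.

```python
from statistics import quantiles

# Hypothetical sample: 80 zeros and 20 non-zero budgets, mimicking the
# roughly 80% share of zeros reported for this column.
sample = [0] * 80 + [1_000_000 * k for k in range(1, 21)]

q1, median, q3 = quantiles(sample, n=4, method="inclusive")
iqr = q3 - q1

print(q1, median, q3, iqr)
```

In practice this means quantile-based summaries say little about budget unless the zeros, which probably stand for unknown budgets, are set aside first.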

Descriptive statistics

Standard deviation: 17442463
Coefficient of variation (CV): 4.1186309
Kurtosis: 66.604548
Mean: 4235014.7
Median Absolute Deviation (MAD): 0
Skewness: 7.1164523
Sum: 1.9212568 × 10^11
Variance: 3.0423951 × 10^14
Monotonicity: Not monotonic
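
Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks the reported budget figures against those identities. The tolerance is an arbitrary choice that allows for rounding in the report.

```python
import math

# Figures copied from the report.
n_observations = 45366
mean = 4235014.7
std = 17442463
reported_cv = 4.1186309
reported_variance = 3.0423951e14
reported_sum = 1.9212568e11

cv = std / mean
variance = std ** 2
total = mean * n_observations

# The comparisons are loose because the report rounds its inputs.
print(math.isclose(cv, reported_cv, rel_tol=1e-6))
print(math.isclose(variance, reported_variance, rel_tol=1e-6))
print(math.isclose(total, reported_sum, rel_tol=1e-6))
```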
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 36477 | 80.4%
5000000 | 286 | 0.6%
10000000 | 259 | 0.6%
20000000 | 243 | 0.5%
2000000 | 242 | 0.5%
15000000 | 226 | 0.5%
3000000 | 223 | 0.5%
25000000 | 206 | 0.5%
1000000 | 197 | 0.4%
30000000 | 192 | 0.4%
Other values (1213) | 6815 | 15.0%

Value | Count | Frequency (%)
0 | 36477 | 80.4%
1 | 25 | 0.1%
2 | 14 | < 0.1%
3 | 9 | < 0.1%
4 | 8 | < 0.1%
5 | 8 | < 0.1%
6 | 5 | < 0.1%
7 | 4 | < 0.1%
8 | 5 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
380000000 | 1 | < 0.1%
300000000 | 1 | < 0.1%
280000000 | 1 | < 0.1%
270000000 | 1 | < 0.1%
260000000 | 3 | < 0.1%
258000000 | 1 | < 0.1%
255000000 | 1 | < 0.1%
250000000 | 10 | < 0.1%
245000000 | 2 | < 0.1%
237000000 | 1 | < 0.1%

genres
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 4064
Distinct (%): 9.5%
Missing: 2383
Missing (%): 5.3%
Memory size: 354.5 KiB

Value | Count
Drama | 5001
Comedy | 3620
Documentary | 2713
Drama, Romance | 1300
Comedy, Drama | 1133
Other values (4059) | 29216

Length

Max length: 80
Median length: 65
Mean length: 16.46188
Min length: 3

Characters and Unicode

Total characters: 707581
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2366
Unique (%): 5.5%

Sample

1st row: Animation, Comedy, Family
2nd row: Adventure, Fantasy, Family
3rd row: Romance, Comedy
4th row: Comedy, Drama, Romance
5th row: Comedy

Common Values

Value | Count | Frequency (%)
Drama | 5001 | 11.0%
Comedy | 3620 | 8.0%
Documentary | 2713 | 6.0%
Drama, Romance | 1300 | 2.9%
Comedy, Drama | 1133 | 2.5%
Horror | 974 | 2.1%
Comedy, Romance | 930 | 2.0%
Comedy, Drama, Romance | 593 | 1.3%
Drama, Comedy | 531 | 1.2%
Horror, Thriller | 528 | 1.2%
Other values (4054) | 25660 | 56.6%
(Missing) | 2383 | 5.3%
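
Because genres is stored as a comma-separated string, the same set of genres in a different order counts as a different value: "Comedy, Drama" and "Drama, Comedy" appear as separate rows above. One possible way to reduce the cardinality, sketched below with an illustrative helper `normalise`, is to sort the labels before counting. Whether label order carries meaning in this dataset is not stated in the report, so treat this as an assumption.

```python
from collections import Counter

# Counts copied from the "Common Values" table (a few rows only).
combo_counts = {
    "Drama": 5001,
    "Comedy": 3620,
    "Drama, Romance": 1300,
    "Comedy, Drama": 1133,
    "Comedy, Romance": 930,
    "Drama, Comedy": 531,
}


def normalise(combo):
    """Sort the genre labels so that their order no longer matters."""
    return ", ".join(sorted(part.strip() for part in combo.split(",")))


merged = Counter()
for combo, count in combo_counts.items():
    merged[normalise(combo)] += count

for combo, count in merged.most_common():
    print(combo, count)
```

With this normalisation the two Comedy/Drama orderings collapse into a single value with 1133 + 531 = 1664 rows.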

Length

Histogram of lengths of the category
ValueCountFrequency (%)
drama 20250
21.4%
comedy 13178
13.9%
thriller 7618
 
8.0%
romance 6734
 
7.1%
action 6590
 
7.0%
horror 4669
 
4.9%
crime 4306
 
4.5%
documentary 3921
 
4.1%
adventure 3493
 
3.7%
science 3039
 
3.2%
Other values (12) 21021
22.2%

Most occurring characters

ValueCountFrequency (%)
r 69055
 
9.8%
a 61800
 
8.7%
e 55751
 
7.9%
m 53087
 
7.5%
51836
 
7.3%
o 48511
 
6.9%
, 48031
 
6.8%
i 39638
 
5.6%
n 35648
 
5.0%
y 28500
 
4.0%
Other values (20) 215724
30.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 512129
72.4%
Uppercase Letter 95585
 
13.5%
Space Separator 51836
 
7.3%
Other Punctuation 48031
 
6.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 69055
13.5%
a 61800
12.1%
e 55751
10.9%
m 53087
10.4%
o 48511
9.5%
i 39638
7.7%
n 35648
7.0%
y 28500
5.6%
c 27960
5.5%
t 26186
 
5.1%
Other values (7) 65993
12.9%
Uppercase Letter
ValueCountFrequency (%)
D 24171
25.3%
C 17484
18.3%
A 12013
12.6%
F 9737
10.2%
T 8384
 
8.8%
R 6734
 
7.0%
H 6066
 
6.3%
M 4826
 
5.0%
S 3039
 
3.2%
W 2365
 
2.5%
Space Separator
ValueCountFrequency (%)
51836
100.0%
Other Punctuation
ValueCountFrequency (%)
, 48031
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 607714
85.9%
Common 99867
 
14.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 69055
11.4%
a 61800
 
10.2%
e 55751
 
9.2%
m 53087
 
8.7%
o 48511
 
8.0%
i 39638
 
6.5%
n 35648
 
5.9%
y 28500
 
4.7%
c 27960
 
4.6%
t 26186
 
4.3%
Other values (18) 161578
26.6%
Common
ValueCountFrequency (%)
51836
51.9%
, 48031
48.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 707581
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 69055
 
9.8%
a 61800
 
8.7%
e 55751
 
7.9%
m 53087
 
7.5%
51836
 
7.3%
o 48511
 
6.9%
, 48031
 
6.8%
i 39638
 
5.6%
n 35648
 
5.0%
y 28500
 
4.0%
Other values (20) 215724
30.5%

id
Real number (ℝ)

Distinct: 45345
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 108023.61
Minimum: 2
Maximum: 469172
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 354.5 KiB

Quantile statistics

Minimum: 2
5-th percentile: 5334.25
Q1: 26387.25
Median: 59857.5
Q3: 156500.5
95-th percentile: 357148.75
Maximum: 469172
Range: 469170
Interquartile range (IQR): 130113.25

Descriptive statistics

Standard deviation: 112165.81
Coefficient of variation (CV): 1.0383454
Kurtosis: 0.55962819
Mean: 108023.61
Median Absolute Deviation (MAD): 44418.5
Skewness: 1.283115
Sum: 4.9005989 × 10^9
Variance: 1.2581169 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
4912 | 4 | < 0.1%
110428 | 4 | < 0.1%
132641 | 4 | < 0.1%
69234 | 2 | < 0.1%
77221 | 2 | < 0.1%
159849 | 2 | < 0.1%
84198 | 2 | < 0.1%
22649 | 2 | < 0.1%
12600 | 2 | < 0.1%
10991 | 2 | < 0.1%
Other values (45335) | 45340 | 99.9%
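
The id column has 45345 distinct values across 45366 rows, so some ids repeat even though the report counts 0 fully duplicated rows. Rows that share an id must therefore differ in at least one other column. The sketch below shows one way to list repeated ids with `collections.Counter`. The helper `repeated_ids` and the sample list are hypothetical.

```python
from collections import Counter

# Figures copied from the report.
n_observations = 45366
n_distinct_ids = 45345
surplus_rows = n_observations - n_distinct_ids  # rows beyond the first per id


def repeated_ids(ids):
    """Return {id: count} for every id that occurs more than once."""
    counts = Counter(ids)
    return {value: count for value, count in counts.items() if count > 1}


# Hypothetical sample, not the real column.
sample_ids = [4912, 862, 4912, 8844, 110428, 4912, 110428]

print(surplus_rows)
print(repeated_ids(sample_ids))
```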
ValueCountFrequency (%)
2 1
< 0.1%
3 1
< 0.1%
5 1
< 0.1%
6 1
< 0.1%
11 1
< 0.1%
12 1
< 0.1%
13 1
< 0.1%
14 1
< 0.1%
15 1
< 0.1%
16 1
< 0.1%
ValueCountFrequency (%)
469172 1
< 0.1%
468707 1
< 0.1%
468343 1
< 0.1%
467731 1
< 0.1%
465044 1
< 0.1%
464819 1
< 0.1%
464207 1
< 0.1%
464111 1
< 0.1%
463906 1
< 0.1%
463800 1
< 0.1%

original_language
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 89
Distinct (%): 0.2%
Missing: 11
Missing (%): < 0.1%
Memory size: 354.5 KiB

Value | Count
en | 32196
fr | 2438
it | 1528
ja | 1351
de | 1077
Other values (84) | 6765

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 90710
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17
Unique (%): < 0.1%

Sample

1st row: en
2nd row: en
3rd row: en
4th row: en
5th row: en

Common Values

Value | Count | Frequency (%)
en | 32196 | 71.0%
fr | 2438 | 5.4%
it | 1528 | 3.4%
ja | 1351 | 3.0%
de | 1077 | 2.4%
es | 991 | 2.2%
ru | 822 | 1.8%
hi | 508 | 1.1%
ko | 444 | 1.0%
zh | 408 | 0.9%
Other values (79) | 3592 | 7.9%
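
The imbalance alert for this column (67.4%) reflects how strongly "en" dominates. A common way to score imbalance is one minus the normalised Shannon entropy of the class counts: 0 for a perfectly even split, approaching 1 when a single class holds almost everything. The sketch below implements that definition. Whether pandas-profiling v3.6.6 uses exactly this formula is an assumption, and the function name is illustrative.

```python
import math


def imbalance_score(counts):
    """1 - H / log2(k), where H is the Shannon entropy of the k class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1 - entropy / math.log2(n_classes)


# Hypothetical class counts, not the real column.
print(imbalance_score([10, 10, 10, 10]))  # even split, score at or near 0
print(imbalance_score([97, 1, 1, 1]))     # one dominant class, score near 1
```

Reproducing the reported 67.4% would need the counts for all 89 languages; the table above aggregates 79 of them into "Other values", so it cannot be recomputed from this excerpt alone.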

Length

Histogram of lengths of the category
ValueCountFrequency (%)
en 32196
71.0%
fr 2438
 
5.4%
it 1528
 
3.4%
ja 1351
 
3.0%
de 1077
 
2.4%
es 991
 
2.2%
ru 822
 
1.8%
hi 508
 
1.1%
ko 444
 
1.0%
zh 408
 
0.9%
Other values (79) 3592
 
7.9%

Most occurring characters

ValueCountFrequency (%)
e 34519
38.1%
n 32904
36.3%
r 3631
 
4.0%
f 2833
 
3.1%
i 2386
 
2.6%
t 2250
 
2.5%
a 1839
 
2.0%
s 1650
 
1.8%
j 1352
 
1.5%
d 1321
 
1.5%
Other values (16) 6025
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 90710
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 34519
38.1%
n 32904
36.3%
r 3631
 
4.0%
f 2833
 
3.1%
i 2386
 
2.6%
t 2250
 
2.5%
a 1839
 
2.0%
s 1650
 
1.8%
j 1352
 
1.5%
d 1321
 
1.5%
Other values (16) 6025
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 90710
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 34519
38.1%
n 32904
36.3%
r 3631
 
4.0%
f 2833
 
3.1%
i 2386
 
2.6%
t 2250
 
2.5%
a 1839
 
2.0%
s 1650
 
1.8%
j 1352
 
1.5%
d 1321
 
1.5%
Other values (16) 6025
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 90710
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 34519
38.1%
n 32904
36.3%
r 3631
 
4.0%
f 2833
 
3.1%
i 2386
 
2.6%
t 2250
 
2.5%
a 1839
 
2.0%
s 1650
 
1.8%
j 1352
 
1.5%
d 1321
 
1.5%
Other values (16) 6025
 
6.6%

overview
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct: 44231
Distinct (%): 99.6%
Missing: 941
Missing (%): 2.1%
Memory size: 354.5 KiB

Value | Count
No overview found. | 133
No Overview | 7
(blank) | 5
Winter, 1915. Confined by her family to an asylum in the South of France - where she will never sculpt again - the chronicle of Camille Claudel's reclusive life, as she waits for a visit from her brother, Paul Claudel. | 4
Ten years into a marriage, the wife is disappointed by the husband's lack of financial success, meaning she has to work and can't treat herself and the husband finds the wife slovenly and mean-spirited: she neither cooks not cleans particularly well and is generally disagreeable. In turn, he alternately ignores her and treats her as a servant. Neither is particularly happy, not helped by their unsatisfactory lodgers. The husband is easily seduced by an ex-colleague, a widow with a small child who needs some security, and considers leaving his wife. | 4
Other values (44226) | 44272

Length

Max length: 1000
Median length: 786
Mean length: 323.29738
Min length: 1

Characters and Unicode

Total characters: 14362486
Distinct characters: 429
Distinct categories: 25
Distinct scripts: 13
Distinct blocks: 21

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44185
Unique (%): 99.5%

Sample

1st row: Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences.
2nd row: When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures.
3rd row: A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max.
4th row: Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive "good man" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe.
5th row: Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own.

Common Values

Value | Count | Frequency (%)
No overview found. | 133 | 0.3%
No Overview | 7 | < 0.1%
(blank) | 5 | < 0.1%
Winter, 1915. Confined by her family to an asylum in the South of France - where she will never sculpt again - the chronicle of Camille Claudel's reclusive life, as she waits for a visit from her brother, Paul Claudel. | 4 | < 0.1%
Ten years into a marriage, the wife is disappointed by the husband's lack of financial success, meaning she has to work and can't treat herself and the husband finds the wife slovenly and mean-spirited: she neither cooks not cleans particularly well and is generally disagreeable. In turn, he alternately ignores her and treats her as a servant. Neither is particularly happy, not helped by their unsatisfactory lodgers. The husband is easily seduced by an ex-colleague, a widow with a small child who needs some security, and considers leaving his wife. | 4 | < 0.1%
Television made him famous, but his biggest hits happened off screen. Television producer by day, CIA assassin by night, Chuck Barris was recruited by the CIA at the height of his TV career and trained to become a covert operative. Or so Barris said. | 4 | < 0.1%
No movie overview available. | 3 | < 0.1%
A few funny little novels about different aspects of life. | 3 | < 0.1%
Adaptation of the Jane Austen novel. | 3 | < 0.1%
A group of travelers, including a monk, stay in a lonely inn in the mountains. The host confesses the monk his habit of serving poisoned soup to the guests, to rob their possessions and to bury them in the backyard. The story unfolds as the monk tries to save the guest's lives without violating the holy secrecy of the confession. | 2 | < 0.1%
Other values (44221) | 44257 | 97.6%
(Missing) | 941 | 2.1%
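
Several of the most common overview values are placeholders rather than real text ("No overview found.", "No Overview", "No movie overview available."), so the 941 missing values understate how many rows lack a usable overview. The sketch below shows one possible way to map such placeholders, along with blank strings, to a missing value. The placeholder set covers only the strings visible in this table, and the function name is illustrative.

```python
# Placeholder strings seen in the "Common Values" table, lower-cased.
PLACEHOLDERS = {
    "no overview found.",
    "no overview",
    "no movie overview available.",
}


def clean_overview(text):
    """Return None for missing, blank or placeholder overviews."""
    if text is None:
        return None
    stripped = text.strip()
    if not stripped or stripped.lower() in PLACEHOLDERS:
        return None
    return stripped


print(clean_overview("No overview found."))
print(clean_overview("   "))
print(clean_overview("Adaptation of the Jane Austen novel."))
```

The three placeholder strings listed above account for 133 + 7 + 3 = 143 rows.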

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the 138054
 
5.6%
a 98868
 
4.0%
and 75243
 
3.1%
to 73297
 
3.0%
of 69558
 
2.8%
in 48132
 
2.0%
is 36498
 
1.5%
his 36152
 
1.5%
with 23893
 
1.0%
her 21478
 
0.9%
Other values (97091) 1826995
74.6%

Most occurring characters

ValueCountFrequency (%)
2405820
16.8%
e 1363533
 
9.5%
a 940287
 
6.5%
t 934520
 
6.5%
i 851319
 
5.9%
o 829629
 
5.8%
n 822396
 
5.7%
s 767677
 
5.3%
r 744084
 
5.2%
h 600669
 
4.2%
Other values (419) 4102552
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11147505
77.6%
Space Separator 2405858
 
16.8%
Uppercase Letter 390904
 
2.7%
Other Punctuation 312765
 
2.2%
Decimal Number 42221
 
0.3%
Dash Punctuation 36763
 
0.3%
Close Punctuation 10097
 
0.1%
Open Punctuation 10074
 
0.1%
Final Punctuation 4553
 
< 0.1%
Initial Punctuation 881
 
< 0.1%
Other values (15) 865
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1363533
12.2%
a 940287
 
8.4%
t 934520
 
8.4%
i 851319
 
7.6%
o 829629
 
7.4%
n 822396
 
7.4%
s 767677
 
6.9%
r 744084
 
6.7%
h 600669
 
5.4%
l 478724
 
4.3%
Other values (142) 2814667
25.2%
Uppercase Letter
ValueCountFrequency (%)
A 42750
 
10.9%
T 35968
 
9.2%
S 31119
 
8.0%
M 23947
 
6.1%
B 23696
 
6.1%
C 22812
 
5.8%
H 19425
 
5.0%
W 18646
 
4.8%
I 16796
 
4.3%
D 16309
 
4.2%
Other values (77) 139436
35.7%
Other Letter
ValueCountFrequency (%)
6
 
4.8%
6
 
4.8%
5
 
4.0%
4
 
3.2%
3
 
2.4%
3
 
2.4%
3
 
2.4%
3
 
2.4%
م 2
 
1.6%
2
 
1.6%
Other values (76) 88
70.4%
Other Punctuation
ValueCountFrequency (%)
, 133411
42.7%
. 124771
39.9%
' 31118
 
9.9%
" 11661
 
3.7%
: 3298
 
1.1%
? 2759
 
0.9%
; 2493
 
0.8%
! 1543
 
0.5%
/ 765
 
0.2%
& 453
 
0.1%
Other values (12) 493
 
0.2%
Nonspacing Mark
ValueCountFrequency (%)
́ 4
12.1%
ి 4
12.1%
̈ 3
9.1%
3
9.1%
3
9.1%
3
9.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
Other values (4) 5
15.2%
Decimal Number
ValueCountFrequency (%)
1 9748
23.1%
0 8265
19.6%
9 6406
15.2%
2 4250
10.1%
5 2442
 
5.8%
8 2378
 
5.6%
3 2340
 
5.5%
4 2176
 
5.2%
7 2131
 
5.0%
6 2085
 
4.9%
Spacing Mark
ValueCountFrequency (%)
11
40.7%
4
 
14.8%
3
 
11.1%
3
 
11.1%
ि 2
 
7.4%
2
 
7.4%
1
 
3.7%
ி 1
 
3.7%
Dash Punctuation
ValueCountFrequency (%)
- 35240
95.9%
881
 
2.4%
633
 
1.7%
5
 
< 0.1%
4
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
® 45
70.3%
14
 
21.9%
¦ 2
 
3.1%
° 2
 
3.1%
1
 
1.6%
Math Symbol
ValueCountFrequency (%)
~ 20
50.0%
+ 11
27.5%
= 6
 
15.0%
| 2
 
5.0%
1
 
2.5%
Open Punctuation
ValueCountFrequency (%)
( 10021
99.5%
[ 50
 
0.5%
{ 2
 
< 0.1%
1
 
< 0.1%
Currency Symbol
ValueCountFrequency (%)
$ 317
96.4%
£ 10
 
3.0%
1
 
0.3%
1
 
0.3%
Space Separator
ValueCountFrequency (%)
2405820
> 99.9%
  36
 
< 0.1%
  2
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 10045
99.5%
] 50
 
0.5%
} 2
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
3845
84.4%
689
 
15.1%
» 19
 
0.4%
Initial Punctuation
ValueCountFrequency (%)
671
76.2%
192
 
21.8%
« 18
 
2.0%
Control
ValueCountFrequency (%)
106
96.4%
’ 3
 
2.7%
 1
 
0.9%
Modifier Symbol
ValueCountFrequency (%)
´ 25
65.8%
` 12
31.6%
¯ 1
 
2.6%
Format
ValueCountFrequency (%)
31
60.8%
­ 20
39.2%
Other Number
ValueCountFrequency (%)
½ 8
50.0%
¹ 8
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 19
100.0%
Line Separator
ValueCountFrequency (%)
7
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%
Paragraph Separator
ValueCountFrequency (%)
2
100.0%
Modifier Letter
ValueCountFrequency (%)
ʼ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11533177
80.3%
Common 2823890
 
19.7%
Cyrillic 4587
 
< 0.1%
Greek 648
 
< 0.1%
Devanagari 77
 
< 0.1%
Telugu 30
 
< 0.1%
Hiragana 20
 
< 0.1%
Tamil 19
 
< 0.1%
Han 10
 
< 0.1%
Hangul 9
 
< 0.1%
Other values (3) 19
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1363533
11.8%
a 940287
 
8.2%
t 934520
 
8.1%
i 851319
 
7.4%
o 829629
 
7.2%
n 822396
 
7.1%
s 767677
 
6.7%
r 744084
 
6.5%
h 600669
 
5.2%
l 478724
 
4.2%
Other values (132) 3200339
27.7%
Common
ValueCountFrequency (%)
2405820
85.2%
, 133411
 
4.7%
. 124771
 
4.4%
- 35240
 
1.2%
' 31118
 
1.1%
" 11661
 
0.4%
) 10045
 
0.4%
( 10021
 
0.4%
1 9748
 
0.3%
0 8265
 
0.3%
Other values (71) 43790
 
1.6%
Cyrillic
ValueCountFrequency (%)
о 470
 
10.2%
е 404
 
8.8%
а 373
 
8.1%
н 323
 
7.0%
и 299
 
6.5%
т 265
 
5.8%
р 240
 
5.2%
с 218
 
4.8%
в 173
 
3.8%
л 161
 
3.5%
Other values (46) 1661
36.2%
Greek
ValueCountFrequency (%)
α 60
 
9.3%
ο 55
 
8.5%
τ 43
 
6.6%
ι 36
 
5.6%
η 36
 
5.6%
ν 34
 
5.2%
ρ 31
 
4.8%
ε 31
 
4.8%
ς 30
 
4.6%
π 30
 
4.6%
Other values (33) 262
40.4%
Devanagari
ValueCountFrequency (%)
11
 
14.3%
6
 
7.8%
6
 
7.8%
5
 
6.5%
4
 
5.2%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
Other values (21) 30
39.0%
Hiragana
ValueCountFrequency (%)
4
20.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (7) 7
35.0%
Telugu
ValueCountFrequency (%)
ి 4
13.3%
3
10.0%
3
10.0%
3
10.0%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
1
 
3.3%
Other values (6) 6
20.0%
Tamil
ValueCountFrequency (%)
3
15.8%
2
10.5%
2
10.5%
2
10.5%
2
10.5%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
Other values (3) 3
15.8%
Han
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Hangul
ValueCountFrequency (%)
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
Thai
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Arabic
ValueCountFrequency (%)
م 2
50.0%
ہ 1
25.0%
ت 1
25.0%
Inherited
ValueCountFrequency (%)
́ 4
57.1%
̈ 3
42.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14344494
99.9%
Punctuation 7266
 
0.1%
None 5928
 
< 0.1%
Cyrillic 4587
 
< 0.1%
Devanagari 77
 
< 0.1%
Telugu 30
 
< 0.1%
Hiragana 20
 
< 0.1%
Tamil 19
 
< 0.1%
Letterlike Symbols 14
 
< 0.1%
CJK 10
 
< 0.1%
Other values (11) 41
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2405820
16.8%
e 1363533
 
9.5%
a 940287
 
6.6%
t 934520
 
6.5%
i 851319
 
5.9%
o 829629
 
5.8%
n 822396
 
5.7%
s 767677
 
5.4%
r 744084
 
5.2%
h 600669
 
4.2%
Other values (82) 4084560
28.5%
Punctuation
ValueCountFrequency (%)
3845
52.9%
881
 
12.1%
689
 
9.5%
671
 
9.2%
633
 
8.7%
303
 
4.2%
192
 
2.6%
31
 
0.4%
7
 
0.1%
5
 
0.1%
Other values (4) 9
 
0.1%
None
ValueCountFrequency (%)
é 1550
26.1%
ä 294
 
5.0%
á 293
 
4.9%
ö 250
 
4.2%
í 243
 
4.1%
è 209
 
3.5%
ü 178
 
3.0%
ı 165
 
2.8%
ó 164
 
2.8%
ç 158
 
2.7%
Other values (141) 2424
40.9%
Cyrillic
ValueCountFrequency (%)
о 470
 
10.2%
е 404
 
8.8%
а 373
 
8.1%
н 323
 
7.0%
и 299
 
6.5%
т 265
 
5.8%
р 240
 
5.2%
с 218
 
4.8%
в 173
 
3.8%
л 161
 
3.5%
Other values (46) 1661
36.2%
Letterlike Symbols
ValueCountFrequency (%)
14
100.0%
Devanagari
ValueCountFrequency (%)
11
 
14.3%
6
 
7.8%
6
 
7.8%
5
 
6.5%
4
 
5.2%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
Other values (21) 30
39.0%
Alphabetic PF
ValueCountFrequency (%)
4
100.0%
Hiragana
ValueCountFrequency (%)
4
20.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (7) 7
35.0%
Diacriticals
ValueCountFrequency (%)
́ 4
57.1%
̈ 3
42.9%
Telugu
ValueCountFrequency (%)
ి 4
13.3%
3
10.0%
3
10.0%
3
10.0%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
1
 
3.3%
Other values (6) 6
20.0%
Tamil
ValueCountFrequency (%)
3
15.8%
2
10.5%
2
10.5%
2
10.5%
2
10.5%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
Other values (3) 3
15.8%
Arabic
ValueCountFrequency (%)
م 2
50.0%
ہ 1
25.0%
ت 1
25.0%
Hangul
ValueCountFrequency (%)
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
Number Forms
ValueCountFrequency (%)
2
100.0%
Thai
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Modifier Letters
ValueCountFrequency (%)
ʼ 2
100.0%
CJK
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Math Operators
ValueCountFrequency (%)
1
100.0%
Katakana
ValueCountFrequency (%)
1
100.0%
Currency Symbols
ValueCountFrequency (%)
1
50.0%
1
50.0%
Specials
ValueCountFrequency (%)
1
100.0%

popularity
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 2017
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9264705
Minimum: 0
Maximum: 547.49
Zeros: 1428
Zeros (%): 3.1%
Negative: 0
Negative (%): 0.0%
Memory size: 354.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.02
Q1: 0.39
Median: 1.13
Q3: 3.69
95-th percentile: 11.06
Maximum: 547.49
Range: 547.49
Interquartile range (IQR): 3.3

Descriptive statistics

Standard deviation: 6.0101494
Coefficient of variation (CV): 2.0537194
Kurtosis: 1923.5152
Mean: 2.9264705
Median Absolute Deviation (MAD): 0.97
Skewness: 29.214569
Sum: 132762.26
Variance: 36.121895
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1428 | 3.1%
0.01 | 651 | 1.4%
0.04 | 591 | 1.3%
0.08 | 411 | 0.9%
0.11 | 409 | 0.9%
0.15 | 381 | 0.8%
0.07 | 349 | 0.8%
0.05 | 337 | 0.7%
0.12 | 319 | 0.7%
0.02 | 292 | 0.6%
Other values (2007) | 40198 | 88.6%

Value | Count | Frequency (%)
0 | 1428 | 3.1%
0.01 | 651 | 1.4%
0.02 | 292 | 0.6%
0.03 | 164 | 0.4%
0.04 | 591 | 1.3%
0.05 | 337 | 0.7%
0.06 | 281 | 0.6%
0.07 | 349 | 0.8%
0.08 | 411 | 0.9%
0.09 | 257 | 0.6%

Value | Count | Frequency (%)
547.49 | 1 | < 0.1%
294.34 | 1 | < 0.1%
287.25 | 1 | < 0.1%
228.03 | 1 | < 0.1%
213.85 | 1 | < 0.1%
187.86 | 1 | < 0.1%
185.33 | 1 | < 0.1%
185.07 | 1 | < 0.1%
183.87 | 1 | < 0.1%
154.8 | 1 | < 0.1%

production_companies
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 22666
Distinct (%): 67.5%
Missing: 11792
Missing (%): 26.0%
Memory size: 354.5 KiB

Value | Count
Metro-Goldwyn-Mayer (MGM) | 742
Warner Bros. | 540
Paramount Pictures | 505
Twentieth Century Fox Film Corporation | 439
Universal Pictures | 320
Other values (22661) | 31028

Length

Max length: 609
Median length: 412
Mean length: 41.490886
Min length: 2

Characters and Unicode

Total characters: 1393015
Distinct characters: 294
Distinct categories: 17
Distinct scripts: 6
Distinct blocks: 6

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20309
Unique (%): 60.5%

Sample

1st row: Pixar Animation Studios
2nd row: TriStar Pictures, Teitler Film, Interscope Communications
3rd row: Warner Bros., Lancaster Gate
4th row: Twentieth Century Fox Film Corporation
5th row: Sandollar Productions, Touchstone Pictures

Common Values

Value | Count | Frequency (%)
Metro-Goldwyn-Mayer (MGM) | 742 | 1.6%
Warner Bros. | 540 | 1.2%
Paramount Pictures | 505 | 1.1%
Twentieth Century Fox Film Corporation | 439 | 1.0%
Universal Pictures | 320 | 0.7%
RKO Radio Pictures | 247 | 0.5%
Columbia Pictures Corporation | 207 | 0.5%
Columbia Pictures | 146 | 0.3%
Mosfilm | 145 | 0.3%
Walt Disney Pictures | 85 | 0.2%
Other values (22656) | 30198 | 66.6%
(Missing) | 11792 | 26.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
films 9456
 
5.3%
pictures 9266
 
5.2%
productions 9058
 
5.1%
film 6672
 
3.8%
entertainment 5153
 
2.9%
corporation 2189
 
1.2%
company 1770
 
1.0%
warner 1478
 
0.8%
bros 1411
 
0.8%
the 1381
 
0.8%
Other values (18616) 129810
73.1%

Most occurring characters

ValueCountFrequency (%)
(space) 144079
 
10.3%
i 106901
 
7.7%
e 94614
 
6.8%
n 89944
 
6.5%
o 85269
 
6.1%
r 83528
 
6.0%
t 83401
 
6.0%
a 77130
 
5.5%
s 62653
 
4.5%
l 51244
 
3.7%
Other values (284) 514252
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 986721
70.8%
Uppercase Letter 198935
 
14.3%
Space Separator 144084
 
10.3%
Other Punctuation 45100
 
3.2%
Decimal Number 4349
 
0.3%
Dash Punctuation 4329
 
0.3%
Open Punctuation 4325
 
0.3%
Close Punctuation 4324
 
0.3%
Math Symbol 663
 
< 0.1%
Other Letter 140
 
< 0.1%
Other values (7) 45
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 106901
10.8%
e 94614
9.6%
n 89944
9.1%
o 85269
8.6%
r 83528
8.5%
t 83401
8.5%
a 77130
 
7.8%
s 62653
 
6.3%
l 51244
 
5.2%
m 44264
 
4.5%
Other values (102) 207773
21.1%
Other Letter
ValueCountFrequency (%)
9
 
6.4%
8
 
5.7%
6
 
4.3%
5
 
3.6%
5
 
3.6%
5
 
3.6%
5
 
3.6%
5
 
3.6%
4
 
2.9%
3
 
2.1%
Other values (62) 85
60.7%
Uppercase Letter
ValueCountFrequency (%)
P 27879
14.0%
F 26351
13.2%
C 20583
 
10.3%
M 13359
 
6.7%
S 11908
 
6.0%
E 9744
 
4.9%
A 9554
 
4.8%
T 9355
 
4.7%
B 9000
 
4.5%
G 7806
 
3.9%
Other values (52) 53396
26.8%
Other Punctuation
ValueCountFrequency (%)
, 37346
82.8%
. 5681
 
12.6%
& 764
 
1.7%
/ 644
 
1.4%
' 451
 
1.0%
" 133
 
0.3%
! 36
 
0.1%
% 18
 
< 0.1%
: 9
 
< 0.1%
@ 5
 
< 0.1%
Other values (6) 13
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 1034
23.8%
1 712
16.4%
0 641
14.7%
3 558
12.8%
4 481
11.1%
9 205
 
4.7%
6 195
 
4.5%
5 178
 
4.1%
8 173
 
4.0%
7 172
 
4.0%
Open Punctuation
ValueCountFrequency (%)
( 4315
99.8%
[ 9
 
0.2%
1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 4314
99.8%
] 9
 
0.2%
1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
144079
> 99.9%
  5
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 4327
> 99.9%
2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 662
99.8%
| 1
 
0.2%
Other Symbol
ValueCountFrequency (%)
° 23
92.0%
2
 
8.0%
Final Punctuation
ValueCountFrequency (%)
3
50.0%
» 3
50.0%
Other Number
ValueCountFrequency (%)
² 1
50.0%
½ 1
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Control
ValueCountFrequency (%)
4
100.0%
Initial Punctuation
ValueCountFrequency (%)
« 3
100.0%
Format
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1185253
85.1%
Common 207217
 
14.9%
Cyrillic 373
 
< 0.1%
Hangul 115
 
< 0.1%
Greek 31
 
< 0.1%
Han 26
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 106901
 
9.0%
e 94614
 
8.0%
n 89944
 
7.6%
o 85269
 
7.2%
r 83528
 
7.0%
t 83401
 
7.0%
a 77130
 
6.5%
s 62653
 
5.3%
l 51244
 
4.3%
m 44264
 
3.7%
Other values (99) 406305
34.3%
Hangul
ValueCountFrequency (%)
9
 
7.8%
8
 
7.0%
6
 
5.2%
5
 
4.3%
5
 
4.3%
5
 
4.3%
5
 
4.3%
5
 
4.3%
4
 
3.5%
3
 
2.6%
Other values (43) 60
52.2%
Common
ValueCountFrequency (%)
144079
69.5%
, 37346
 
18.0%
. 5681
 
2.7%
- 4327
 
2.1%
( 4315
 
2.1%
) 4314
 
2.1%
2 1034
 
0.5%
& 764
 
0.4%
1 712
 
0.3%
+ 662
 
0.3%
Other values (37) 3983
 
1.9%
Cyrillic
ValueCountFrequency (%)
и 34
 
9.1%
о 28
 
7.5%
а 26
 
7.0%
л 22
 
5.9%
н 20
 
5.4%
м 19
 
5.1%
т 17
 
4.6%
е 16
 
4.3%
с 16
 
4.3%
ь 16
 
4.3%
Other values (36) 159
42.6%
Greek
ValueCountFrequency (%)
ν 3
 
9.7%
ο 3
 
9.7%
Κ 2
 
6.5%
ρ 2
 
6.5%
τ 2
 
6.5%
η 2
 
6.5%
Ε 2
 
6.5%
λ 2
 
6.5%
ι 2
 
6.5%
έ 1
 
3.2%
Other values (10) 10
32.3%
Han
ValueCountFrequency (%)
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
1
 
3.8%
1
 
3.8%
1
 
3.8%
Other values (9) 9
34.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1386786
99.6%
None 5710
 
0.4%
Cyrillic 373
 
< 0.1%
Hangul 113
 
< 0.1%
CJK 26
 
< 0.1%
Punctuation 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
144079
 
10.4%
i 106901
 
7.7%
e 94614
 
6.8%
n 89944
 
6.5%
o 85269
 
6.1%
r 83528
 
6.0%
t 83401
 
6.0%
a 77130
 
5.6%
s 62653
 
4.5%
l 51244
 
3.7%
Other values (77) 508023
36.6%
None
ValueCountFrequency (%)
é 3176
55.6%
ó 416
 
7.3%
á 317
 
5.6%
í 173
 
3.0%
ü 154
 
2.7%
ñ 150
 
2.6%
ô 140
 
2.5%
è 136
 
2.4%
ä 136
 
2.4%
ö 132
 
2.3%
Other values (76) 780
 
13.7%
Cyrillic
ValueCountFrequency (%)
и 34
 
9.1%
о 28
 
7.5%
а 26
 
7.0%
л 22
 
5.9%
н 20
 
5.4%
м 19
 
5.1%
т 17
 
4.6%
е 16
 
4.3%
с 16
 
4.3%
ь 16
 
4.3%
Other values (36) 159
42.6%
Hangul
ValueCountFrequency (%)
9
 
8.0%
8
 
7.1%
6
 
5.3%
5
 
4.4%
5
 
4.4%
5
 
4.4%
5
 
4.4%
5
 
4.4%
4
 
3.5%
3
 
2.7%
Other values (42) 58
51.3%
Punctuation
ValueCountFrequency (%)
3
42.9%
2
28.6%
1
 
14.3%
1
 
14.3%
CJK
ValueCountFrequency (%)
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
1
 
3.8%
1
 
3.8%
1
 
3.8%
Other values (9) 9
34.6%

production_countries
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct2388
Distinct (%)6.1%
Missing6208
Missing (%)13.7%
Memory size354.5 KiB
United States of America
17845 
United Kingdom
2235 
France
 
1655
Japan
 
1358
Italy
 
1029
Other values (2383)
15036 

Length

Max length237
Median length167
Mean length19.045355
Min length4

Characters and Unicode

Total characters745778
Distinct characters53
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1767
Unique (%)4.5%

Sample

1st rowUnited States of America
2nd rowUnited States of America
3rd rowUnited States of America
4th rowUnited States of America
5th rowUnited States of America

Common Values

Value | Count | Frequency (%)
United States of America | 17845 | 39.3%
United Kingdom | 2235 | 4.9%
France | 1655 | 3.6%
Japan | 1358 | 3.0%
Italy | 1029 | 2.3%
Canada | 840 | 1.9%
Germany | 748 | 1.6%
India | 735 | 1.6%
Russia | 734 | 1.6%
United Kingdom, United States of America | 569 | 1.3%
Other values (2378) | 11410 | 25.2%
(Missing) | 6208 | 13.7%
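Values such as "United Kingdom, United States of America" suggest that a single cell can list several countries, which would inflate the distinct count well beyond the number of actual countries. A minimal sketch of counting individual members instead of whole cells follows. It assumes that ", " is the separator and that no country name contains it; the helper name `count_members` is hypothetical, and the cell values are copied from the table above.

```python
from collections import Counter

# A few cell values copied from the report; each cell may list several countries.
cells = [
    "United States of America",
    "United Kingdom, United States of America",
    "France",
    "United Kingdom",
]


def count_members(cells, sep=", "):
    """Split each cell on `sep` and count the individual members."""
    counts = Counter()
    for cell in cells:
        counts.update(part.strip() for part in cell.split(sep))
    return counts


country_counts = count_members(cells)
for country, n in country_counts.most_common():
    print(country, n)
```

Applied to the full column, a count of this kind would describe per-country frequency rather than the frequency of each exact combination.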

Length

Histogram of lengths of the category
ValueCountFrequency (%)
united 25263
21.3%
states 21147
17.8%
of 21146
17.8%
america 21146
17.8%
kingdom 4089
 
3.4%
france 3937
 
3.3%
germany 2257
 
1.9%
italy 2167
 
1.8%
canada 1765
 
1.5%
japan 1650
 
1.4%
Other values (177) 14162
11.9%

Most occurring characters

ValueCountFrequency (%)
e 80630
 
10.8%
(space) 79571
 
10.7%
t 72613
 
9.7%
a 70475
 
9.4%
i 58538
 
7.8%
n 47476
 
6.4%
d 34534
 
4.6%
r 32478
 
4.4%
o 29574
 
4.0%
m 28694
 
3.8%
Other values (43) 211195
28.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 558427
74.9%
Uppercase Letter 97545
 
13.1%
Space Separator 79571
 
10.7%
Other Punctuation 10235
 
1.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 80630
14.4%
t 72613
13.0%
a 70475
12.6%
i 58538
10.5%
n 47476
8.5%
d 34534
6.2%
r 32478
5.8%
o 29574
 
5.3%
m 28694
 
5.1%
c 26368
 
4.7%
Other values (16) 77047
13.8%
Uppercase Letter
ValueCountFrequency (%)
U 25364
26.0%
S 23833
24.4%
A 22388
23.0%
K 5216
 
5.3%
F 4330
 
4.4%
I 3581
 
3.7%
C 2594
 
2.7%
G 2470
 
2.5%
J 1666
 
1.7%
R 1307
 
1.3%
Other values (14) 4796
 
4.9%
Other Punctuation
ValueCountFrequency (%)
, 10230
> 99.9%
' 5
 
< 0.1%
Space Separator
ValueCountFrequency (%)
79571
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 655972
88.0%
Common 89806
 
12.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 80630
12.3%
t 72613
11.1%
a 70475
10.7%
i 58538
 
8.9%
n 47476
 
7.2%
d 34534
 
5.3%
r 32478
 
5.0%
o 29574
 
4.5%
m 28694
 
4.4%
c 26368
 
4.0%
Other values (40) 174592
26.6%
Common
ValueCountFrequency (%)
79571
88.6%
, 10230
 
11.4%
' 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 745778
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 80630
 
10.8%
79571
 
10.7%
t 72613
 
9.7%
a 70475
 
9.4%
i 58538
 
7.8%
n 47476
 
6.4%
d 34534
 
4.6%
r 32478
 
4.4%
o 29574
 
4.0%
m 28694
 
3.8%
Other values (43) 211195
28.3%
Distinct17333
Distinct (%)38.2%
Missing0
Missing (%)0.0%
Memory size354.5 KiB
Minimum1874-12-09 00:00:00
Maximum2020-12-16 00:00:00
Histogram with fixed size bins (bins=50)

revenue
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct6863
Distinct (%)15.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean11233994
Minimum0
Maximum2.7879651 × 10^9
Zeros37958
Zeros (%)83.7%
Negative0
Negative (%)0.0%
Memory size354.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile48025328
Maximum2.7879651 × 10^9
Range2.7879651 × 10^9
Interquartile range (IQR)0

Descriptive statistics

Standard deviation64396963
Coefficient of variation (CV)5.73233
Kurtosis237.0229
Mean11233994
Median Absolute Deviation (MAD)0
Skewness12.253245
Sum5.0964139 × 10^11
Variance4.1469688 × 10^15
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 37958 | 83.7%
12000000 | 20 | < 0.1%
10000000 | 19 | < 0.1%
11000000 | 19 | < 0.1%
2000000 | 18 | < 0.1%
6000000 | 17 | < 0.1%
5000000 | 14 | < 0.1%
8000000 | 13 | < 0.1%
500000 | 13 | < 0.1%
1 | 12 | < 0.1%
Other values (6853) | 7263 | 16.0%

Value | Count | Frequency (%)
0 | 37958 | 83.7%
1 | 12 | < 0.1%
2 | 3 | < 0.1%
3 | 9 | < 0.1%
4 | 4 | < 0.1%
5 | 5 | < 0.1%
6 | 2 | < 0.1%
7 | 4 | < 0.1%
8 | 5 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
2787965087 | 1 | < 0.1%
2068223624 | 1 | < 0.1%
1845034188 | 1 | < 0.1%
1519557910 | 1 | < 0.1%
1513528810 | 1 | < 0.1%
1506249360 | 1 | < 0.1%
1405403694 | 1 | < 0.1%
1342000000 | 1 | < 0.1%
1274219009 | 1 | < 0.1%
1262886337 | 1 | < 0.1%

runtime
Real number (ℝ)

Distinct353
Distinct (%)0.8%
Missing246
Missing (%)0.5%
Infinite0
Infinite (%)0.0%
Mean94.181738
Minimum0
Maximum1256
Zeros1534
Zeros (%)3.4%
Negative0
Negative (%)0.0%
Memory size354.5 KiB

Quantile statistics

Minimum0
5-th percentile12
Q185
median95
Q3107
95-th percentile138
Maximum1256
Range1256
Interquartile range (IQR)22

Descriptive statistics

Standard deviation38.34118
Coefficient of variation (CV)0.40709781
Kurtosis93.944905
Mean94.181738
Median Absolute Deviation (MAD)11
Skewness4.4919715
Sum4249480
Variance1470.0461
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
90 | 2548 | 5.6%
0 | 1534 | 3.4%
100 | 1470 | 3.2%
95 | 1412 | 3.1%
93 | 1213 | 2.7%
96 | 1104 | 2.4%
92 | 1078 | 2.4%
94 | 1062 | 2.3%
91 | 1055 | 2.3%
88 | 1030 | 2.3%
Other values (343) | 31614 | 69.7%

Value | Count | Frequency (%)
0 | 1534 | 3.4%
1 | 107 | 0.2%
2 | 33 | 0.1%
3 | 48 | 0.1%
4 | 50 | 0.1%
5 | 51 | 0.1%
6 | 72 | 0.2%
7 | 103 | 0.2%
8 | 78 | 0.2%
9 | 63 | 0.1%

Value | Count | Frequency (%)
1256 | 1 | < 0.1%
1140 | 2 | < 0.1%
931 | 1 | < 0.1%
925 | 1 | < 0.1%
900 | 1 | < 0.1%
877 | 1 | < 0.1%
874 | 1 | < 0.1%
840 | 2 | < 0.1%
780 | 1 | < 0.1%
720 | 1 | < 0.1%

spoken_languages
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct1841
Distinct (%)4.4%
Missing3888
Missing (%)8.6%
Memory size354.5 KiB
English
22377 
Français
 
1853
日本語
 
1291
Italiano
 
1217
Español
 
901
Other values (1836)
13839 

Length

Max length171
Median length7
Mean length9.3976807
Min length2

Characters and Unicode

Total characters389797
Distinct characters171
Distinct categories8
Distinct scripts15
Distinct blocks16
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1294
Unique (%)3.1%

Sample

1st rowEnglish
2nd rowEnglish, Français
3rd rowEnglish
4th rowEnglish
5th rowEnglish

Common Values

Value | Count | Frequency (%)
English | 22377 | 49.3%
Français | 1853 | 4.1%
日本語 | 1291 | 2.8%
Italiano | 1217 | 2.7%
Español | 901 | 2.0%
Pусский | 807 | 1.8%
Deutsch | 760 | 1.7%
English, Français | 681 | 1.5%
English, Español | 572 | 1.3%
हिन्दी | 480 | 1.1%
Other values (1831) | 10539 | 23.2%
(Missing) | 3888 | 8.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
english 28725
52.8%
français 4194
 
7.7%
deutsch 2623
 
4.8%
español 2412
 
4.4%
italiano 2366
 
4.4%
日本語 1760
 
3.2%
pусский 1562
 
2.9%
普通话 790
 
1.5%
हिन्दी 706
 
1.3%
663
 
1.2%
Other values (69) 8559
 
15.7%

Most occurring characters

ValueCountFrequency (%)
s 42259
10.8%
n 37456
 
9.6%
i 37103
 
9.5%
l 34627
 
8.9%
h 31454
 
8.1%
E 31194
 
8.0%
g 30409
 
7.8%
a 18944
 
4.9%
(space) 13076
 
3.4%
, 11663
 
3.0%
Other values (161) 101612
26.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 291973
74.9%
Uppercase Letter 46421
 
11.9%
Other Letter 22189
 
5.7%
Space Separator 13076
 
3.4%
Other Punctuation 12728
 
3.3%
Spacing Mark 1836
 
0.5%
Nonspacing Mark 1548
 
0.4%
Control 26
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 42259
14.5%
n 37456
12.8%
i 37103
12.7%
l 34627
11.9%
h 31454
10.8%
g 30409
10.4%
a 18944
6.5%
o 7050
 
2.4%
r 6127
 
2.1%
t 5976
 
2.0%
Other values (63) 40568
13.9%
Other Letter
ValueCountFrequency (%)
1760
 
7.9%
1760
 
7.9%
1760
 
7.9%
1263
 
5.7%
946
 
4.3%
790
 
3.6%
790
 
3.6%
706
 
3.2%
706
 
3.2%
706
 
3.2%
Other values (46) 11002
49.6%
Uppercase Letter
ValueCountFrequency (%)
E 31194
67.2%
F 4196
 
9.0%
D 2924
 
6.3%
P 2677
 
5.8%
I 2366
 
5.1%
N 828
 
1.8%
L 505
 
1.1%
M 362
 
0.8%
T 308
 
0.7%
Č 284
 
0.6%
Other values (13) 777
 
1.7%
Spacing Mark
ValueCountFrequency (%)
706
38.5%
ि 706
38.5%
136
 
7.4%
ி 111
 
6.0%
94
 
5.1%
47
 
2.6%
18
 
1.0%
18
 
1.0%
Nonspacing Mark
ValueCountFrequency (%)
706
45.6%
ִ 430
27.8%
ְ 215
 
13.9%
111
 
7.2%
68
 
4.4%
18
 
1.2%
Other Punctuation
ValueCountFrequency (%)
, 11663
91.6%
/ 1015
 
8.0%
? 50
 
0.4%
Space Separator
ValueCountFrequency (%)
13076
100.0%
Control
ValueCountFrequency (%)
š 26
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 326005
83.6%
Common 25830
 
6.6%
Han 10488
 
2.7%
Cyrillic 10454
 
2.7%
Devanagari 4236
 
1.1%
Arabic 3339
 
0.9%
Hangul 3252
 
0.8%
Hebrew 1720
 
0.4%
Greek 1704
 
0.4%
Thai 1232
 
0.3%
Other values (5) 1537
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 42259
13.0%
n 37456
11.5%
i 37103
11.4%
l 34627
10.6%
h 31454
9.6%
E 31194
9.6%
g 30409
9.3%
a 18944
 
5.8%
o 7050
 
2.2%
r 6127
 
1.9%
Other values (50) 49382
15.1%
Cyrillic
ValueCountFrequency (%)
с 3211
30.7%
к 1734
16.6%
и 1679
16.1%
й 1615
15.4%
у 1564
15.0%
а 113
 
1.1%
р 87
 
0.8%
У 53
 
0.5%
ї 53
 
0.5%
н 53
 
0.5%
Other values (12) 292
 
2.8%
Arabic
ValueCountFrequency (%)
ا 536
16.1%
ر 536
16.1%
ل 341
10.2%
ع 341
10.2%
ب 341
10.2%
ي 341
10.2%
ة 341
10.2%
ی 140
 
4.2%
ف 140
 
4.2%
س 140
 
4.2%
Other values (5) 142
 
4.3%
Han
ValueCountFrequency (%)
1760
16.8%
1760
16.8%
1760
16.8%
1263
12.0%
946
9.0%
790
7.5%
790
7.5%
广 473
 
4.5%
473
 
4.5%
473
 
4.5%
Hebrew
ValueCountFrequency (%)
ִ 430
25.0%
ת 215
12.5%
י 215
12.5%
ר 215
12.5%
ְ 215
12.5%
ב 215
12.5%
ע 215
12.5%
Greek
ValueCountFrequency (%)
λ 426
25.0%
ά 213
12.5%
κ 213
12.5%
ι 213
12.5%
ν 213
12.5%
η 213
12.5%
ε 213
12.5%
Georgian
ValueCountFrequency (%)
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
Devanagari
ValueCountFrequency (%)
706
16.7%
706
16.7%
706
16.7%
706
16.7%
706
16.7%
ि 706
16.7%
Hangul
ValueCountFrequency (%)
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
Thai
ValueCountFrequency (%)
352
28.6%
176
14.3%
176
14.3%
176
14.3%
176
14.3%
176
14.3%
Gurmukhi
ValueCountFrequency (%)
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
Common
ValueCountFrequency (%)
13076
50.6%
, 11663
45.2%
/ 1015
 
3.9%
? 50
 
0.2%
š 26
 
0.1%
Telugu
ValueCountFrequency (%)
136
33.3%
68
16.7%
68
16.7%
68
16.7%
68
16.7%
Tamil
ValueCountFrequency (%)
111
20.0%
ி 111
20.0%
111
20.0%
111
20.0%
111
20.0%
Bengali
ValueCountFrequency (%)
94
40.0%
47
20.0%
47
20.0%
47
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 342979
88.0%
CJK 10488
 
2.7%
Cyrillic 10454
 
2.7%
None 10434
 
2.7%
Devanagari 4236
 
1.1%
Arabic 3339
 
0.9%
Hangul 3252
 
0.8%
Hebrew 1720
 
0.4%
Thai 1232
 
0.3%
Tamil 555
 
0.1%
Other values (6) 1108
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 42259
12.3%
n 37456
10.9%
i 37103
10.8%
l 34627
10.1%
h 31454
9.2%
E 31194
9.1%
g 30409
8.9%
a 18944
 
5.5%
13076
 
3.8%
, 11663
 
3.4%
Other values (38) 54794
16.0%
None
ValueCountFrequency (%)
ç 4441
42.6%
ñ 2412
23.1%
ê 591
 
5.7%
λ 426
 
4.1%
ý 284
 
2.7%
Č 284
 
2.7%
ü 247
 
2.4%
ά 213
 
2.0%
κ 213
 
2.0%
ι 213
 
2.0%
Other values (11) 1110
 
10.6%
Cyrillic
ValueCountFrequency (%)
с 3211
30.7%
к 1734
16.6%
и 1679
16.1%
й 1615
15.4%
у 1564
15.0%
а 113
 
1.1%
р 87
 
0.8%
У 53
 
0.5%
ї 53
 
0.5%
н 53
 
0.5%
Other values (12) 292
 
2.8%
CJK
ValueCountFrequency (%)
1760
16.8%
1760
16.8%
1760
16.8%
1263
12.0%
946
9.0%
790
7.5%
790
7.5%
广 473
 
4.5%
473
 
4.5%
473
 
4.5%
Devanagari
ValueCountFrequency (%)
706
16.7%
706
16.7%
706
16.7%
706
16.7%
706
16.7%
ि 706
16.7%
Hangul
ValueCountFrequency (%)
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
Arabic
ValueCountFrequency (%)
ا 536
16.1%
ر 536
16.1%
ل 341
10.2%
ع 341
10.2%
ب 341
10.2%
ي 341
10.2%
ة 341
10.2%
ی 140
 
4.2%
ف 140
 
4.2%
س 140
 
4.2%
Other values (5) 142
 
4.3%
Hebrew
ValueCountFrequency (%)
ִ 430
25.0%
ת 215
12.5%
י 215
12.5%
ר 215
12.5%
ְ 215
12.5%
ב 215
12.5%
ע 215
12.5%
Thai
ValueCountFrequency (%)
352
28.6%
176
14.3%
176
14.3%
176
14.3%
176
14.3%
176
14.3%
Telugu
ValueCountFrequency (%)
136
33.3%
68
16.7%
68
16.7%
68
16.7%
68
16.7%
Tamil
ValueCountFrequency (%)
111
20.0%
ி 111
20.0%
111
20.0%
111
20.0%
111
20.0%
Bengali
ValueCountFrequency (%)
94
40.0%
47
20.0%
47
20.0%
47
20.0%
Latin Ext Additional
ValueCountFrequency (%)
ế 61
50.0%
61
50.0%
Georgian
ValueCountFrequency (%)
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
Gurmukhi
ValueCountFrequency (%)
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
IPA Ext
ValueCountFrequency (%)
ə 4
100.0%

status
Categorical

Distinct6
Distinct (%)< 0.1%
Missing80
Missing (%)0.2%
Memory size354.5 KiB
Released
44927 
Rumored
 
229
Post Production
 
97
In Production
 
19
Planned
 
13

Length

Max length15
Median length8
Mean length8.0117476
Min length7

Characters and Unicode

Total characters362820
Distinct characters18
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowReleased
2nd rowReleased
3rd rowReleased
4th rowReleased
5th rowReleased

Common Values

Value | Count | Frequency (%)
Released | 44927 | 99.0%
Rumored | 229 | 0.5%
Post Production | 97 | 0.2%
In Production | 19 | < 0.1%
Planned | 13 | < 0.1%
Canceled | 1 | < 0.1%
(Missing) | 80 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values; image not preserved]
ValueCountFrequency (%)
released 44927
99.0%
rumored 229
 
0.5%
production 116
 
0.3%
post 97
 
0.2%
in 19
 
< 0.1%
planned 13
 
< 0.1%
canceled 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 135025
37.2%
d 45286
 
12.5%
R 45156
 
12.4%
s 45024
 
12.4%
l 44941
 
12.4%
a 44941
 
12.4%
o 558
 
0.2%
r 345
 
0.1%
u 345
 
0.1%
m 229
 
0.1%
Other values (8) 970
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 317302
87.5%
Uppercase Letter 45402
 
12.5%
Space Separator 116
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 135025
42.6%
d 45286
 
14.3%
s 45024
 
14.2%
l 44941
 
14.2%
a 44941
 
14.2%
o 558
 
0.2%
r 345
 
0.1%
u 345
 
0.1%
m 229
 
0.1%
t 213
 
0.1%
Other values (3) 395
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
R 45156
99.5%
P 226
 
0.5%
I 19
 
< 0.1%
C 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
116
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 362704
> 99.9%
Common 116
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 135025
37.2%
d 45286
 
12.5%
R 45156
 
12.4%
s 45024
 
12.4%
l 44941
 
12.4%
a 44941
 
12.4%
o 558
 
0.2%
r 345
 
0.1%
u 345
 
0.1%
m 229
 
0.1%
Other values (7) 854
 
0.2%
Common
ValueCountFrequency (%)
116
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 362820
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 135025
37.2%
d 45286
 
12.5%
R 45156
 
12.4%
s 45024
 
12.4%
l 44941
 
12.4%
a 44941
 
12.4%
o 558
 
0.2%
r 345
 
0.1%
u 345
 
0.1%
m 229
 
0.1%
Other values (8) 970
 
0.3%

tagline
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct20269
Distinct (%)99.4%
Missing24970
Missing (%)55.0%
Memory size354.5 KiB
Based on a true story.
 
7
Some things are better left top secret.
 
4
Be careful what you wish for.
 
4
-
 
4
Trust no one.
 
4
Other values (20264)
20373 

Length

Max length297
Median length204
Mean length46.998333
Min length1

Characters and Unicode

Total characters958578
Distinct characters170
Distinct categories17
Distinct scripts6
Distinct blocks10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique20166
Unique (%)98.9%

Sample

1st rowRoll the dice and unleash the excitement!
2nd rowStill Yelling. Still Fighting. Still Ready for Love.
3rd rowFriends are the people who let you be yourself... and never let you forget it.
4th rowJust When His World Is Back To Normal... He's In For The Surprise Of His Life!
5th rowA Los Angeles Crime Saga

Common Values

Value | Count | Frequency (%)
Based on a true story. | 7 | < 0.1%
Some things are better left top secret. | 4 | < 0.1%
Be careful what you wish for. | 4 | < 0.1%
- | 4 | < 0.1%
Trust no one. | 4 | < 0.1%
Documentary | 3 | < 0.1%
The end is near. | 3 | < 0.1%
There are two sides to every love story. | 3 | < 0.1%
Drama | 3 | < 0.1%
Classic Albums | 3 | < 0.1%
Other values (20259) | 20358 | 44.9%
(Missing) | 24970 | 55.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the 10993
 
6.3%
a 6812
 
3.9%
of 4403
 
2.5%
to 3582
 
2.1%
is 2793
 
1.6%
in 2693
 
1.5%
and 2682
 
1.5%
you 2389
 
1.4%
1580
 
0.9%
for 1523
 
0.9%
Other values (15100) 134460
77.3%

Most occurring characters

ValueCountFrequency (%)
(space) 153662
16.0%
e 94404
 
9.8%
t 57263
 
6.0%
o 56557
 
5.9%
a 51467
 
5.4%
n 47494
 
5.0%
i 46029
 
4.8%
r 44978
 
4.7%
s 42358
 
4.4%
h 37161
 
3.9%
Other values (160) 327205
34.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 680397
71.0%
Space Separator 153662
 
16.0%
Uppercase Letter 74988
 
7.8%
Other Punctuation 44582
 
4.7%
Decimal Number 2687
 
0.3%
Dash Punctuation 1942
 
0.2%
Final Punctuation 98
 
< 0.1%
Open Punctuation 56
 
< 0.1%
Close Punctuation 55
 
< 0.1%
Currency Symbol 37
 
< 0.1%
Other values (7) 74
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 94404
13.9%
t 57263
 
8.4%
o 56557
 
8.3%
a 51467
 
7.6%
n 47494
 
7.0%
i 46029
 
6.8%
r 44978
 
6.6%
s 42358
 
6.2%
h 37161
 
5.5%
l 30172
 
4.4%
Other values (43) 172514
25.4%
Other Letter
ValueCountFrequency (%)
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
Other values (24) 24
70.6%
Uppercase Letter
ValueCountFrequency (%)
T 10008
 
13.3%
A 6873
 
9.2%
S 5653
 
7.5%
H 4402
 
5.9%
I 4387
 
5.9%
E 4306
 
5.7%
W 3679
 
4.9%
O 3477
 
4.6%
N 3195
 
4.3%
L 3194
 
4.3%
Other values (20) 25814
34.4%
Other Punctuation
ValueCountFrequency (%)
. 26648
59.8%
! 5784
 
13.0%
' 5674
 
12.7%
, 4224
 
9.5%
? 1159
 
2.6%
" 582
 
1.3%
148
 
0.3%
: 138
 
0.3%
& 83
 
0.2%
* 42
 
0.1%
Other values (7) 100
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 802
29.8%
1 516
19.2%
2 299
 
11.1%
9 208
 
7.7%
3 208
 
7.7%
5 168
 
6.3%
4 140
 
5.2%
6 121
 
4.5%
7 121
 
4.5%
8 104
 
3.9%
Math Symbol
ValueCountFrequency (%)
= 5
35.7%
+ 5
35.7%
| 2
 
14.3%
~ 1
 
7.1%
1
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 1925
99.1%
9
 
0.5%
8
 
0.4%
Final Punctuation
ValueCountFrequency (%)
82
83.7%
15
 
15.3%
» 1
 
1.0%
Initial Punctuation
ValueCountFrequency (%)
14
73.7%
4
 
21.1%
« 1
 
5.3%
Open Punctuation
ValueCountFrequency (%)
( 49
87.5%
[ 7
 
12.5%
Close Punctuation
ValueCountFrequency (%)
) 48
87.3%
] 7
 
12.7%
Other Number
ValueCountFrequency (%)
½ 2
66.7%
² 1
33.3%
Modifier Letter
ValueCountFrequency (%)
ˌ 1
50.0%
ˈ 1
50.0%
Space Separator
ValueCountFrequency (%)
153662
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 37
100.0%
Nonspacing Mark
ValueCountFrequency (%)
1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 755385
78.8%
Common 203158
 
21.2%
Han 21
 
< 0.1%
Tamil 5
 
< 0.1%
Hiragana 5
 
< 0.1%
Katakana 4
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 94404
 
12.5%
t 57263
 
7.6%
o 56557
 
7.5%
a 51467
 
6.8%
n 47494
 
6.3%
i 46029
 
6.1%
r 44978
 
6.0%
s 42358
 
5.6%
h 37161
 
4.9%
l 30172
 
4.0%
Other values (73) 247502
32.8%
Common
ValueCountFrequency (%)
153662
75.6%
. 26648
 
13.1%
! 5784
 
2.8%
' 5674
 
2.8%
, 4224
 
2.1%
- 1925
 
0.9%
? 1159
 
0.6%
0 802
 
0.4%
" 582
 
0.3%
1 516
 
0.3%
Other values (42) 2182
 
1.1%
Han
ValueCountFrequency (%)
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (11) 11
52.4%
Tamil
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Hiragana
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Katakana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 958148
> 99.9%
Punctuation 280
 
< 0.1%
None 110
 
< 0.1%
CJK 21
 
< 0.1%
Tamil 5
 
< 0.1%
Hiragana 5
 
< 0.1%
Katakana 4
 
< 0.1%
IPA Ext 2
 
< 0.1%
Modifier Letters 2
 
< 0.1%
Math Operators 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
153662
16.0%
e 94404
 
9.9%
t 57263
 
6.0%
o 56557
 
5.9%
a 51467
 
5.4%
n 47494
 
5.0%
i 46029
 
4.8%
r 44978
 
4.7%
s 42358
 
4.4%
h 37161
 
3.9%
Other values (78) 326775
34.1%
Punctuation
ValueCountFrequency (%)
148
52.9%
82
29.3%
15
 
5.4%
14
 
5.0%
9
 
3.2%
8
 
2.9%
4
 
1.4%
None
ValueCountFrequency (%)
é 18
16.4%
ä 16
14.5%
ö 8
 
7.3%
ó 6
 
5.5%
á 6
 
5.5%
í 5
 
4.5%
ı 5
 
4.5%
ü 5
 
4.5%
· 4
 
3.6%
ñ 3
 
2.7%
Other values (26) 34
30.9%
IPA Ext
ValueCountFrequency (%)
ə 2
100.0%
Tamil
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
CJK
ValueCountFrequency (%)
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (11) 11
52.4%
Katakana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Modifier Letters
ValueCountFrequency (%)
ˌ 1
50.0%
ˈ 1
50.0%
Hiragana
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

title
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct42195
Distinct (%)93.0%
Missing0
Missing (%)0.0%
Memory size354.5 KiB
Cinderella
 
11
Alice in Wonderland
 
9
Hamlet
 
9
Les Misérables
 
8
Beauty and the Beast
 
8
Other values (42190)
45321 

Length

Max length105
Median length79
Mean length16.703611
Min length1

Characters and Unicode

Total characters757776
Distinct characters287
Distinct categories17
Distinct scripts7
Distinct blocks12
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique39877
Unique (%)87.9%

Sample

1st rowToy Story
2nd rowJumanji
3rd rowGrumpier Old Men
4th rowWaiting to Exhale
5th rowFather of the Bride Part II

Common Values

Value | Count | Frequency (%)
Cinderella | 11 | < 0.1%
Alice in Wonderland | 9 | < 0.1%
Hamlet | 9 | < 0.1%
Les Misérables | 8 | < 0.1%
Beauty and the Beast | 8 | < 0.1%
The Three Musketeers | 7 | < 0.1%
Treasure Island | 7 | < 0.1%
A Christmas Carol | 7 | < 0.1%
The Hound of the Baskervilles | 6 | < 0.1%
Countdown | 6 | < 0.1%
Other values (42185) | 45288 | 99.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the 14550
 
10.7%
of 4930
 
3.6%
a 2243
 
1.6%
in 1693
 
1.2%
and 1631
 
1.2%
to 1054
 
0.8%
757
 
0.6%
man 665
 
0.5%
love 664
 
0.5%
for 601
 
0.4%
Other values (24353) 107377
78.9%

Most occurring characters

ValueCountFrequency (%)
(space) 90821
 
12.0%
e 76236
 
10.1%
a 48933
 
6.5%
o 45664
 
6.0%
n 40820
 
5.4%
r 40005
 
5.3%
i 39768
 
5.2%
t 36716
 
4.8%
s 29516
 
3.9%
h 28508
 
3.8%
Other values (277) 280789
37.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 534068
70.5%
Uppercase Letter 117247
 
15.5%
Space Separator 90821
 
12.0%
Other Punctuation 10487
 
1.4%
Decimal Number 3858
 
0.5%
Dash Punctuation 981
 
0.1%
Close Punctuation 87
 
< 0.1%
Open Punctuation 85
 
< 0.1%
Final Punctuation 38
 
< 0.1%
Other Letter 25
 
< 0.1%
Other values (7) 79
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 76236
14.3%
a 48933
9.2%
o 45664
 
8.6%
n 40820
 
7.6%
r 40005
 
7.5%
i 39768
 
7.4%
t 36716
 
6.9%
s 29516
 
5.5%
h 28508
 
5.3%
l 25927
 
4.9%
Other values (121) 121975
22.8%
Uppercase Letter
ValueCountFrequency (%)
T 16013
13.7%
S 10333
 
8.8%
M 8032
 
6.9%
B 7655
 
6.5%
C 7170
 
6.1%
A 6785
 
5.8%
D 6334
 
5.4%
L 5869
 
5.0%
H 5170
 
4.4%
W 5167
 
4.4%
Other values (65) 38719
33.0%
Other Letter
ValueCountFrequency (%)
چ 2
 
8.0%
ه 2
 
8.0%
ک 2
 
8.0%
ی 2
 
8.0%
1
 
4.0%
1
 
4.0%
ª 1
 
4.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
Other values (11) 11
44.0%
Other Punctuation
ValueCountFrequency (%)
: 3717
35.4%
' 2504
23.9%
. 1603
15.3%
, 1133
 
10.8%
! 647
 
6.2%
& 458
 
4.4%
? 269
 
2.6%
/ 79
 
0.8%
* 19
 
0.2%
# 13
 
0.1%
Other values (8) 45
 
0.4%
Decimal Number
ValueCountFrequency (%)
2 861
22.3%
1 701
18.2%
0 616
16.0%
3 482
12.5%
9 232
 
6.0%
4 229
 
5.9%
5 227
 
5.9%
7 193
 
5.0%
8 161
 
4.2%
6 156
 
4.0%
Math Symbol
ValueCountFrequency (%)
+ 17
70.8%
× 3
 
12.5%
1
 
4.2%
= 1
 
4.2%
1
 
4.2%
1
 
4.2%
Other Number
ValueCountFrequency (%)
½ 12
63.2%
² 3
 
15.8%
³ 2
 
10.5%
1
 
5.3%
1
 
5.3%
Other Symbol
ValueCountFrequency (%)
° 3
37.5%
2
25.0%
1
 
12.5%
1
 
12.5%
1
 
12.5%
Currency Symbol
ValueCountFrequency (%)
$ 18
85.7%
¢ 2
 
9.5%
£ 1
 
4.8%
Dash Punctuation
ValueCountFrequency (%)
- 966
98.5%
15
 
1.5%
Close Punctuation
ValueCountFrequency (%)
) 82
94.3%
] 5
 
5.7%
Open Punctuation
ValueCountFrequency (%)
( 80
94.1%
[ 5
 
5.9%
Final Punctuation
ValueCountFrequency (%)
37
97.4%
1
 
2.6%
Initial Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
90821
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Format
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value     Count   Frequency (%)
Latin     650800  85.9%
Common    106436  14.0%
Cyrillic  346     < 0.1%
Greek     170     < 0.1%
Arabic    11      < 0.1%
Katakana  8       < 0.1%
Han       5       < 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 76236
 
11.7%
a 48933
 
7.5%
o 45664
 
7.0%
n 40820
 
6.3%
r 40005
 
6.1%
i 39768
 
6.1%
t 36716
 
5.6%
s 29516
 
4.5%
h 28508
 
4.4%
l 25927
 
4.0%
Other values (107) 238707
36.7%
Common
ValueCountFrequency (%)
90821
85.3%
: 3717
 
3.5%
' 2504
 
2.4%
. 1603
 
1.5%
, 1133
 
1.1%
- 966
 
0.9%
2 861
 
0.8%
1 701
 
0.7%
! 647
 
0.6%
0 616
 
0.6%
Other values (50) 2867
 
2.7%
Cyrillic
ValueCountFrequency (%)
о 32
 
9.2%
е 32
 
9.2%
а 29
 
8.4%
н 24
 
6.9%
и 23
 
6.6%
р 22
 
6.4%
к 17
 
4.9%
с 15
 
4.3%
т 14
 
4.0%
в 14
 
4.0%
Other values (38) 124
35.8%
Greek
ValueCountFrequency (%)
α 20
 
11.8%
ο 14
 
8.2%
ι 14
 
8.2%
τ 9
 
5.3%
ρ 8
 
4.7%
ά 8
 
4.7%
λ 8
 
4.7%
ν 7
 
4.1%
ς 6
 
3.5%
ε 6
 
3.5%
Other values (32) 70
41.2%
Katakana
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Arabic
ValueCountFrequency (%)
چ 2
18.2%
ه 2
18.2%
ک 2
18.2%
ی 2
18.2%
س 1
9.1%
ا 1
9.1%
ج 1
9.1%
Han
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

Most occurring blocks

Value               Count   Frequency (%)
ASCII               756212  99.8%
None                1123    0.1%
Cyrillic            346     < 0.1%
Punctuation         62      < 0.1%
Arabic              11      < 0.1%
Katakana            8       < 0.1%
CJK                 5       < 0.1%
Misc Symbols        3       < 0.1%
Letterlike Symbols  2       < 0.1%
Math Operators      2       < 0.1%
Other values (2)    2       < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
90821
 
12.0%
e 76236
 
10.1%
a 48933
 
6.5%
o 45664
 
6.0%
n 40820
 
5.4%
r 40005
 
5.3%
i 39768
 
5.3%
t 36716
 
4.9%
s 29516
 
3.9%
h 28508
 
3.8%
Other values (76) 279225
36.9%
None
ValueCountFrequency (%)
é 218
19.4%
ä 127
 
11.3%
ö 55
 
4.9%
è 53
 
4.7%
ô 44
 
3.9%
ü 39
 
3.5%
ó 37
 
3.3%
á 35
 
3.1%
ı 35
 
3.1%
í 33
 
2.9%
Other values (108) 447
39.8%
Punctuation
ValueCountFrequency (%)
37
59.7%
15
24.2%
5
 
8.1%
2
 
3.2%
1
 
1.6%
1
 
1.6%
1
 
1.6%
Cyrillic
ValueCountFrequency (%)
о 32
 
9.2%
е 32
 
9.2%
а 29
 
8.4%
н 24
 
6.9%
и 23
 
6.6%
р 22
 
6.4%
к 17
 
4.9%
с 15
 
4.3%
т 14
 
4.0%
в 14
 
4.0%
Other values (38) 124
35.8%
Arabic
ValueCountFrequency (%)
چ 2
18.2%
ه 2
18.2%
ک 2
18.2%
ی 2
18.2%
س 1
9.1%
ا 1
9.1%
ج 1
9.1%
Misc Symbols
ValueCountFrequency (%)
2
66.7%
1
33.3%
CJK
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Number Forms
ValueCountFrequency (%)
1
100.0%
Letterlike Symbols
ValueCountFrequency (%)
1
50.0%
1
50.0%
Math Operators
ValueCountFrequency (%)
1
50.0%
1
50.0%
Katakana
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Arrows
ValueCountFrequency (%)
1
100.0%

vote_average
Real number (ℝ)

Distinct 92
Distinct (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 5.6239673
Minimum 0
Maximum 10
Zeros 2947
Zeros (%) 6.5%
Negative 0
Negative (%) 0.0%
Memory size 354.5 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 5
Median 6
Q3 6.8
95-th percentile 7.8
Maximum 10
Range 10
Interquartile range (IQR) 1.8

Descriptive statistics

Standard deviation 1.9155471
Coefficient of variation (CV) 0.34060424
Kurtosis 2.5414062
Mean 5.6239673
Median Absolute Deviation (MAD) 0.9
Skewness -1.5244317
Sum 255136.9
Variance 3.6693206
Monotonicity Not monotonic
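
Several of the figures listed for vote_average are tied together by simple identities, which gives a quick way to sanity-check the report. The sketch below re-derives the range, IQR, variance, and coefficient of variation from the other printed values. The variable names are illustrative and not part of the report, and small differences are expected because the printed values are rounded.

```python
# Values copied from the vote_average section of the report.
minimum, maximum = 0.0, 10.0
q1, q3 = 5.0, 6.8
mean = 5.6239673
std = 1.9155471

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
variance = std ** 2               # Variance is the squared standard deviation
cv = std / mean                   # Coefficient of variation (CV)

print(f"range={value_range:g} iqr={iqr:.1f} "
      f"variance={variance:.6f} cv={cv:.6f}")
```

Each derived value should agree with the corresponding report line to within rounding.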
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
0                  2947   6.5%
6                  2462   5.4%
5                  1996   4.4%
7                  1885   4.2%
6.5                1722   3.8%
6.3                1602   3.5%
5.5                1381   3.0%
5.8                1369   3.0%
6.4                1349   3.0%
6.7                1339   3.0%
Other values (82)  27314  60.2%
Minimum 10 values
Value  Count  Frequency (%)
0      2947   6.5%
0.5    13     < 0.1%
0.7    1      < 0.1%
1      103    0.2%
1.1    1      < 0.1%
1.2    4      < 0.1%
1.3    13     < 0.1%
1.4    5      < 0.1%
1.5    30     0.1%
1.6    6      < 0.1%
Maximum 10 values
Value  Count  Frequency (%)
10     185    0.4%
9.8    1      < 0.1%
9.6    1      < 0.1%
9.5    18     < 0.1%
9.4    3      < 0.1%
9.3    18     < 0.1%
9.2    4      < 0.1%
9.1    2      < 0.1%
9      158    0.3%
8.9    7      < 0.1%

vote_count
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 1820
Distinct (%) 4.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 110.11824
Minimum 0
Maximum 14075
Zeros 2849
Zeros (%) 6.3%
Negative 0
Negative (%) 0.0%
Memory size 354.5 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 3
Median 10
Q3 34
95-th percentile 434
Maximum 14075
Range 14075
Interquartile range (IQR) 31

Descriptive statistics

Standard deviation 491.79559
Coefficient of variation (CV) 4.4660684
Kurtosis 150.89469
Mean 110.11824
Median Absolute Deviation (MAD) 8
Skewness 10.439597
Sum 4995624
Variance 241862.9
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
1                    3241   7.1%
2                    3127   6.9%
0                    2849   6.3%
3                    2781   6.1%
4                    2477   5.5%
5                    2096   4.6%
6                    1747   3.9%
7                    1570   3.5%
8                    1359   3.0%
9                    1194   2.6%
Other values (1810)  22925  50.5%
Minimum 10 values
Value  Count  Frequency (%)
0      2849   6.3%
1      3241   7.1%
2      3127   6.9%
3      2781   6.1%
4      2477   5.5%
5      2096   4.6%
6      1747   3.9%
7      1570   3.5%
8      1359   3.0%
9      1194   2.6%
Maximum 10 values
Value  Count  Frequency (%)
14075  1      < 0.1%
12269  1      < 0.1%
12114  1      < 0.1%
12000  1      < 0.1%
11444  1      < 0.1%
11187  1      < 0.1%
10297  1      < 0.1%
10014  1      < 0.1%
9678   1      < 0.1%
9634   1      < 0.1%

cast
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct 42656
Distinct (%) 99.2%
Missing 2348
Missing (%) 5.2%
Memory size 354.5 KiB

Georges Méliès        24
Louis Theroux         15
Mel Blanc             12
Jimmy Carr            9
George Carlin         8
Other values (42651)  42950

Length

Max length 4551
Median length 1364
Mean length 198.06544
Min length 4

Characters and Unicode

Total characters 8520379
Distinct characters 395
Distinct categories 16
Distinct scripts 9
Distinct blocks 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
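
As a rough illustration of the sentence above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for one string taken from the cast sample. The helper name `category_counts` and the partial `CATEGORY_NAMES` mapping are assumptions made for this example; they are not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Partial mapping from two-letter general-category codes to the long names
# used in the tables of this report (a few entries only, for illustration).
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}

def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        # Fall back to the raw two-letter code for unmapped categories.
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = "Tom Hanks, Tim Allen"
for name, count in category_counts(sample).most_common():
    print(name, count)
```

Applied to every value of a text column and summed, counts like these would give a table of the same shape as "Most occurring categories".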

Unique

Unique 42475
Unique (%) 98.7%

Sample

1st row  Tom Hanks, Tim Allen, Don Rickles, Jim Varney, Wallace Shawn, John Ratzenberger, Annie Potts, John Morris, Erik von Detten, Laurie Metcalf, R. Lee Ermey, Sarah Freeman, Penn Jillette
2nd row  Robin Williams, Jonathan Hyde, Kirsten Dunst, Bradley Pierce, Bonnie Hunt, Bebe Neuwirth, David Alan Grier, Patricia Clarkson, Adam Hann-Byrd, Laura Bell Bundy, James Handy, Gillian Barber, Brandon Obray, Cyrus Thiedeke, Gary Joseph Thorup, Leonard Zola, Lloyd Berry, Malcolm Stewart, Annabel Kershaw, Darryl Henriques, Robyn Driscoll, Peter Bryant, Sarah Gilson, Florica Vlad, June Lion, Brenda Lockmuller
3rd row  Walter Matthau, Jack Lemmon, Ann-Margret, Sophia Loren, Daryl Hannah, Burgess Meredith, Kevin Pollak
4th row  Whitney Houston, Angela Bassett, Loretta Devine, Lela Rochon, Gregory Hines, Dennis Haysbert, Michael Beach, Mykelti Williamson, Lamont Johnson, Wesley Snipes
5th row  Steve Martin, Diane Keaton, Martin Short, Kimberly Williams-Paisley, George Newbern, Kieran Culkin, BD Wong, Peter Michael Goetz, Kate McGregor-Stewart, Jane Adams, Eugene Levy, Lori Alan

Common Values

Value                 Count  Frequency (%)
Georges Méliès        24     0.1%
Louis Theroux         15     < 0.1%
Mel Blanc             12     < 0.1%
Jimmy Carr            9      < 0.1%
George Carlin         8      < 0.1%
Werner Herzog         8      < 0.1%
David Attenborough    8      < 0.1%
Louis C.K.            8      < 0.1%
Ricky Gervais         6      < 0.1%
Trevor Noah           6      < 0.1%
Other values (42646)  42914  94.6%
(Missing)             2348   5.2%

Length

Histogram of lengths of the category
Value                  Count    Frequency (%)
john                   9804     0.8%
michael                7458     0.6%
david                  6185     0.5%
robert                 5722     0.5%
james                  5689     0.5%
richard                4446     0.4%
paul                   4313     0.4%
peter                  3901     0.3%
william                3431     0.3%
george                 3416     0.3%
Other values (112933)  1110657  95.3%

Most occurring characters

Value               Count    Frequency (%)
(space)             1122132  13.2%
a                   704925   8.3%
e                   665316   7.8%
n                   524106   6.2%
,                   519485   6.1%
r                   497363   5.8%
i                   484022   5.7%
o                   423803   5.0%
l                   366466   4.3%
s                   255868   3.0%
Other values (385)  2956893  34.7%

Most occurring categories

Value                Count    Frequency (%)
Lowercase Letter     5651100  66.3%
Uppercase Letter     1190431  14.0%
Space Separator      1122135  13.2%
Other Punctuation    541788   6.4%
Dash Punctuation     14101    0.2%
Other Letter         543      < 0.1%
Decimal Number       94       < 0.1%
Final Punctuation    83       < 0.1%
Initial Punctuation  23       < 0.1%
Open Punctuation     23       < 0.1%
Other values (6)     58       < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 704925
12.5%
e 665316
11.8%
n 524106
9.3%
r 497363
 
8.8%
i 484022
 
8.6%
o 423803
 
7.5%
l 366466
 
6.5%
s 255868
 
4.5%
t 253211
 
4.5%
h 197885
 
3.5%
Other values (138) 1278135
22.6%
Other Letter
ValueCountFrequency (%)
ا 32
 
5.9%
م 31
 
5.7%
ی 19
 
3.5%
ع 19
 
3.5%
ن 18
 
3.3%
ر 17
 
3.1%
17
 
3.1%
د 17
 
3.1%
ي 16
 
2.9%
12
 
2.2%
Other values (104) 345
63.5%
Uppercase Letter
ValueCountFrequency (%)
M 109353
 
9.2%
S 92313
 
7.8%
C 84003
 
7.1%
J 83331
 
7.0%
B 82353
 
6.9%
A 70824
 
5.9%
R 67394
 
5.7%
D 65885
 
5.5%
L 61163
 
5.1%
G 54661
 
4.6%
Other values (81) 419151
35.2%
Decimal Number
ValueCountFrequency (%)
5 37
39.4%
0 29
30.9%
1 8
 
8.5%
2 8
 
8.5%
9 4
 
4.3%
7 2
 
2.1%
3 2
 
2.1%
4 2
 
2.1%
8 1
 
1.1%
6 1
 
1.1%
Other Punctuation
ValueCountFrequency (%)
, 519485
95.9%
. 16049
 
3.0%
' 6098
 
1.1%
" 129
 
< 0.1%
· 9
 
< 0.1%
& 6
 
< 0.1%
: 6
 
< 0.1%
! 5
 
< 0.1%
/ 1
 
< 0.1%
Nonspacing Mark
ValueCountFrequency (%)
́ 10
58.8%
2
 
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
Final Punctuation
ValueCountFrequency (%)
74
89.2%
6
 
7.2%
» 3
 
3.6%
Space Separator
ValueCountFrequency (%)
1122132
> 99.9%
  3
 
< 0.1%
Initial Punctuation
ValueCountFrequency (%)
20
87.0%
« 3
 
13.0%
Open Punctuation
ValueCountFrequency (%)
14
60.9%
( 9
39.1%
Format
ValueCountFrequency (%)
5
83.3%
1
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 14101
100.0%
Control
ValueCountFrequency (%)
21
100.0%
Close Punctuation
ValueCountFrequency (%)
) 9
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 3
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 2
100.0%

Most occurring scripts

Value      Count    Frequency (%)
Latin      6838447  80.3%
Common     1678287  19.7%
Cyrillic   3070     < 0.1%
Han        276      < 0.1%
Arabic     241      < 0.1%
Thai       27       < 0.1%
Greek      14       < 0.1%
Inherited  11       < 0.1%
Hangul     6        < 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 704925
 
10.3%
e 665316
 
9.7%
n 524106
 
7.7%
r 497363
 
7.3%
i 484022
 
7.1%
o 423803
 
6.2%
l 366466
 
5.4%
s 255868
 
3.7%
t 253211
 
3.7%
h 197885
 
2.9%
Other values (163) 2465482
36.1%
Han
ValueCountFrequency (%)
17
 
6.2%
12
 
4.3%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
9
 
3.3%
9
 
3.3%
Other values (55) 163
59.1%
Cyrillic
ValueCountFrequency (%)
а 323
 
10.5%
и 315
 
10.3%
о 233
 
7.6%
н 229
 
7.5%
р 215
 
7.0%
е 174
 
5.7%
л 155
 
5.0%
к 136
 
4.4%
т 115
 
3.7%
с 109
 
3.6%
Other values (51) 1066
34.7%
Common
ValueCountFrequency (%)
1122132
66.9%
, 519485
31.0%
. 16049
 
1.0%
- 14101
 
0.8%
' 6098
 
0.4%
" 129
 
< 0.1%
74
 
< 0.1%
5 37
 
< 0.1%
0 29
 
< 0.1%
21
 
< 0.1%
Other values (24) 132
 
< 0.1%
Arabic
ValueCountFrequency (%)
ا 32
13.3%
م 31
12.9%
ی 19
 
7.9%
ع 19
 
7.9%
ن 18
 
7.5%
ر 17
 
7.1%
د 17
 
7.1%
ي 16
 
6.6%
ل 9
 
3.7%
س 8
 
3.3%
Other values (18) 55
22.8%
Thai
ValueCountFrequency (%)
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
1
 
3.7%
1
 
3.7%
1
 
3.7%
1
 
3.7%
Other values (11) 11
40.7%
Hangul
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Greek
ValueCountFrequency (%)
ν 6
42.9%
ί 2
 
14.3%
Ζ 2
 
14.3%
α 2
 
14.3%
ο 2
 
14.3%
Inherited
ValueCountFrequency (%)
́ 10
90.9%
1
 
9.1%

Most occurring blocks

Value                 Count    Frequency (%)
ASCII                 8478314  99.5%
None                  38259    0.4%
Cyrillic              3070     < 0.1%
CJK                   276      < 0.1%
Arabic                241      < 0.1%
Punctuation           120      < 0.1%
Latin Ext Additional  56       < 0.1%
Thai                  27       < 0.1%
Diacriticals          10       < 0.1%
Hangul                6        < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1122132
 
13.2%
a 704925
 
8.3%
e 665316
 
7.8%
n 524106
 
6.2%
, 519485
 
6.1%
r 497363
 
5.9%
i 484022
 
5.7%
o 423803
 
5.0%
l 366466
 
4.3%
s 255868
 
3.0%
Other values (66) 2914828
34.4%
None
ValueCountFrequency (%)
é 9079
23.7%
á 4155
 
10.9%
í 2756
 
7.2%
ô 2333
 
6.1%
ö 2014
 
5.3%
ó 1881
 
4.9%
ü 1492
 
3.9%
ć 1360
 
3.6%
è 1243
 
3.2%
ä 994
 
2.6%
Other values (111) 10952
28.6%
Cyrillic
ValueCountFrequency (%)
а 323
 
10.5%
и 315
 
10.3%
о 233
 
7.6%
н 229
 
7.5%
р 215
 
7.0%
е 174
 
5.7%
л 155
 
5.0%
к 136
 
4.4%
т 115
 
3.7%
с 109
 
3.6%
Other values (51) 1066
34.7%
Punctuation
ValueCountFrequency (%)
74
61.7%
20
 
16.7%
14
 
11.7%
6
 
5.0%
5
 
4.2%
1
 
0.8%
Arabic
ValueCountFrequency (%)
ا 32
13.3%
م 31
12.9%
ی 19
 
7.9%
ع 19
 
7.9%
ن 18
 
7.5%
ر 17
 
7.1%
د 17
 
7.1%
ي 16
 
6.6%
ل 9
 
3.7%
س 8
 
3.3%
Other values (18) 55
22.8%
CJK
ValueCountFrequency (%)
17
 
6.2%
12
 
4.3%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
9
 
3.3%
9
 
3.3%
Other values (55) 163
59.1%
Latin Ext Additional
ValueCountFrequency (%)
15
26.8%
9
16.1%
6
 
10.7%
6
 
10.7%
ế 5
 
8.9%
4
 
7.1%
4
 
7.1%
4
 
7.1%
2
 
3.6%
1
 
1.8%
Diacriticals
ValueCountFrequency (%)
́ 10
100.0%
Thai
ValueCountFrequency (%)
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
1
 
3.7%
1
 
3.7%
1
 
3.7%
1
 
3.7%
Other values (11) 11
40.7%
Hangul
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

crew
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct 42943
Distinct (%) 96.2%
Missing 723
Missing (%) 1.6%
Memory size 354.5 KiB

Director: Georges Méliès                            35
Director: Christian I. Nyby II                      13
Director: Norman McLaren                            12
Director: Charlie Chaplin, Writer: Charlie Chaplin  12
Director: Frederick Wiseman                         12
Other values (42938)                                44559

Length

Max length 5043
Median length 3354
Mean length 233.66304
Min length 11

Characters and Unicode

Total characters 10431419
Distinct characters 333
Distinct categories 15
Distinct scripts 8
Distinct blocks 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 41852
Unique (%) 93.7%

Sample

1st row  Director: John Lasseter, Screenplay: Alec Sokolow, Producer: Ralph Guggenheim, Executive Producer: Steve Jobs, Editor: Robert Gordon, Art Direction: Ralph Eggleston, Foley Editor: Mary Helen Leasman, Animation: Ken Willard, ADR Editor: Marilyn McCoppen, Orchestrator: Don Davis, Color Timer: Dale E. Grahn, CG Painter: William Cone, Original Story: Andrew Stanton, Post Production Supervisor: Patsy Bouge, Sculptor: Shelley Daniels Lekven, Animation Director: Rich Quade, Music: Randy Newman, Layout: Desirée Mourad, Music Editor: James Flamberg, Negative Cutter: Rick Mackay, Title Designer: Susan Bradley, Supervising Technical Director: William Reeves, Songs: Randy Newman, Supervising Animator: Pete Docter, Sound Designer: Gary Rydstrom, Production Supervisor: Karen Robert Jackson, Executive Music Producer: Chris Montan, Visual Effects Supervisor: Thomas Porter, Visual Effects: Brian M. Rosen, Lighting Supervisor: Galyn Susman, Character Designer: Jean Gillmore, Set Dresser: Ann M. Rockwell, Editorial Manager: Julie M. McDonald, Assistant Editor: Dana Mulligan, Editorial Coordinator: Deirdre Morrison, Production Coordinator: Ellen Devine, Unit Publicist: Lauren Beth Strogoff, Sound Re-Recording Mixer: Gary Summers, Supervising Sound Editor: Tim Holland, Sound Effects Editor: Pat Jackson, Sound Design Assistant: Tom Myers, Assistant Sound Editor: Dan Engstrom, Casting Consultant: Ruth Lambert, ADR Voice Casting: Mickie McGowan
2nd row  Executive Producer: Robert W. Cort, Screenplay: Jim Strain, Original Music Composer: James Horner, Director: Joe Johnston, Editor: Robert Dalva, Casting: Nancy Foy, Animation Supervisor: Kyle Balda, Production Design: James D. Bissell, Producer: William Teitler, Director of Photography: Thomas E. Ackerman, Novel: Chris van Allsburg
3rd row  Director: Howard Deutch, Characters: Mark Steven Johnson, Writer: Mark Steven Johnson, Sound Recordist: Jack Keller
4th row  Director: Forest Whitaker, Screenplay: Terry McMillan, Producer: Caron K, Executive Producer: Terry McMillan, Novel: Terry McMillan, Original Music Composer: Kenneth Edmonds
5th row  Original Music Composer: Alan Silvestri, Director of Photography: Elliot Davis, Screenplay: Albert Hackett, Producer: Nancy Meyers, Director: Charles Shyer, Editor: Adam Bernardi

Common Values

Value                                                Count  Frequency (%)
Director: Georges Méliès                             35     0.1%
Director: Christian I. Nyby II                       13     < 0.1%
Director: Norman McLaren                             12     < 0.1%
Director: Charlie Chaplin, Writer: Charlie Chaplin   12     < 0.1%
Director: Frederick Wiseman                          12     < 0.1%
Director: Gerald Thomas, Screenplay: Talbot Rothwell 11     < 0.1%
Director: Stan Brakhage                              10     < 0.1%
Director: James H. White                             10     < 0.1%
Director: James Benning                              10     < 0.1%
Director: William K.L. Dickson                       9      < 0.1%
Other values (42933)                                 44509  98.1%
(Missing)                                            723    1.6%

Length

Histogram of lengths of the category
Value                 Count    Frequency (%)
director              69179    5.3%
producer              36150    2.7%
editor                30798    2.3%
music                 23802    1.8%
writer                20800    1.6%
design                20256    1.5%
of                    19577    1.5%
photography           19508    1.5%
production            17620    1.3%
screenplay            15719    1.2%
Other values (78871)  1041517  79.2%

Most occurring characters

Value               Count    Frequency (%)
(space)             1270337  12.2%
r                   863480   8.3%
e                   795230   7.6%
o                   677278   6.5%
i                   676464   6.5%
a                   594616   5.7%
t                   523212   5.0%
n                   497393   4.8%
:                   344604   3.3%
s                   343041   3.3%
Other values (323)  3845764  36.9%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   7152183  68.6%
Uppercase Letter   1316789  12.6%
Space Separator    1270337  12.2%
Other Punctuation  679118   6.5%
Dash Punctuation   12365    0.1%
Decimal Number     265      < 0.1%
Other Letter       163      < 0.1%
Control            151      < 0.1%
Open Punctuation   16       < 0.1%
Close Punctuation  16       < 0.1%
Other values (5)   16       < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 863480
12.1%
e 795230
11.1%
o 677278
9.5%
i 676464
9.5%
a 594616
 
8.3%
t 523212
 
7.3%
n 497393
 
7.0%
s 343041
 
4.8%
c 322132
 
4.5%
l 286665
 
4.0%
Other values (122) 1572672
22.0%
Other Letter
ValueCountFrequency (%)
ا 9
 
5.5%
م 7
 
4.3%
7
 
4.3%
7
 
4.3%
5
 
3.1%
د 5
 
3.1%
4
 
2.5%
4
 
2.5%
4
 
2.5%
ع 4
 
2.5%
Other values (76) 107
65.6%
Uppercase Letter
ValueCountFrequency (%)
D 162117
12.3%
S 140855
 
10.7%
P 118060
 
9.0%
C 111694
 
8.5%
M 109905
 
8.3%
A 80737
 
6.1%
E 72375
 
5.5%
J 55130
 
4.2%
R 53353
 
4.1%
B 52132
 
4.0%
Other values (75) 360431
27.4%
Decimal Number
ValueCountFrequency (%)
3 183
69.1%
2 37
 
14.0%
4 18
 
6.8%
0 8
 
3.0%
5 7
 
2.6%
8 4
 
1.5%
9 4
 
1.5%
7 3
 
1.1%
1 1
 
0.4%
Other Punctuation
ValueCountFrequency (%)
: 344604
50.7%
, 300060
44.2%
. 31072
 
4.6%
' 2561
 
0.4%
& 698
 
0.1%
/ 98
 
< 0.1%
" 24
 
< 0.1%
· 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 12362
> 99.9%
3
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
6
75.0%
2
 
25.0%
Nonspacing Mark
ValueCountFrequency (%)
̃ 2
50.0%
́ 2
50.0%
Space Separator
ValueCountFrequency (%)
1270337
100.0%
Control
ValueCountFrequency (%)
151
100.0%
Open Punctuation
ValueCountFrequency (%)
( 16
100.0%
Close Punctuation
ValueCountFrequency (%)
) 16
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Math Symbol
ValueCountFrequency (%)
| 1
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 1
100.0%

Most occurring scripts

Value      Count    Frequency (%)
Latin      8468205  81.2%
Common     1962281  18.8%
Cyrillic   749      < 0.1%
Hangul     98       < 0.1%
Arabic     52       < 0.1%
Greek      17       < 0.1%
Han        13       < 0.1%
Inherited  4        < 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 863480
 
10.2%
e 795230
 
9.4%
o 677278
 
8.0%
i 676464
 
8.0%
a 594616
 
7.0%
t 523212
 
6.2%
n 497393
 
5.9%
s 343041
 
4.1%
c 322132
 
3.8%
l 286665
 
3.4%
Other values (143) 2888694
34.1%
Hangul
ValueCountFrequency (%)
7
 
7.1%
7
 
7.1%
5
 
5.1%
4
 
4.1%
4
 
4.1%
4
 
4.1%
3
 
3.1%
3
 
3.1%
3
 
3.1%
3
 
3.1%
Other values (46) 55
56.1%
Cyrillic
ValueCountFrequency (%)
и 86
 
11.5%
а 70
 
9.3%
р 53
 
7.1%
о 49
 
6.5%
л 46
 
6.1%
е 45
 
6.0%
н 39
 
5.2%
к 38
 
5.1%
в 34
 
4.5%
с 31
 
4.1%
Other values (38) 258
34.4%
Common
ValueCountFrequency (%)
1270337
64.7%
: 344604
 
17.6%
, 300060
 
15.3%
. 31072
 
1.6%
- 12362
 
0.6%
' 2561
 
0.1%
& 698
 
< 0.1%
3 183
 
< 0.1%
151
 
< 0.1%
/ 98
 
< 0.1%
Other values (19) 155
 
< 0.1%
Arabic
ValueCountFrequency (%)
ا 9
17.3%
م 7
13.5%
د 5
9.6%
ع 4
7.7%
ي 4
7.7%
ی 4
7.7%
ل 3
 
5.8%
ح 3
 
5.8%
ن 3
 
5.8%
پ 2
 
3.8%
Other values (7) 8
15.4%
Greek
ValueCountFrequency (%)
ς 2
 
11.8%
ρ 2
 
11.8%
Φ 1
 
5.9%
ν 1
 
5.9%
α 1
 
5.9%
β 1
 
5.9%
Α 1
 
5.9%
ο 1
 
5.9%
γ 1
 
5.9%
ώ 1
 
5.9%
Other values (5) 5
29.4%
Han
ValueCountFrequency (%)
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
Other values (3) 3
23.1%
Inherited
ValueCountFrequency (%)
̃ 2
50.0%
́ 2
50.0%

Most occurring blocks

Value                 Count     Frequency (%)
ASCII                 10408883  99.8%
None                  21602     0.2%
Cyrillic              749       < 0.1%
Hangul                98        < 0.1%
Arabic                52        < 0.1%
Punctuation           13        < 0.1%
CJK                   13        < 0.1%
Latin Ext Additional  5         < 0.1%
Diacriticals          4         < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1270337
 
12.2%
r 863480
 
8.3%
e 795230
 
7.6%
o 677278
 
6.5%
i 676464
 
6.5%
a 594616
 
5.7%
t 523212
 
5.0%
n 497393
 
4.8%
: 344604
 
3.3%
s 343041
 
3.3%
Other values (64) 3823228
36.7%
None
ValueCountFrequency (%)
é 5650
26.2%
á 2484
11.5%
í 1571
 
7.3%
ó 1396
 
6.5%
ö 1264
 
5.9%
ô 1143
 
5.3%
è 679
 
3.1%
ü 671
 
3.1%
ç 617
 
2.9%
ä 594
 
2.7%
Other values (106) 5533
25.6%
Cyrillic
ValueCountFrequency (%)
и 86
 
11.5%
а 70
 
9.3%
р 53
 
7.1%
о 49
 
6.5%
л 46
 
6.1%
е 45
 
6.0%
н 39
 
5.2%
к 38
 
5.1%
в 34
 
4.5%
с 31
 
4.1%
Other values (38) 258
34.4%
Arabic
ValueCountFrequency (%)
ا 9
17.3%
م 7
13.5%
د 5
9.6%
ع 4
7.7%
ي 4
7.7%
ی 4
7.7%
ل 3
 
5.8%
ح 3
 
5.8%
ن 3
 
5.8%
پ 2
 
3.8%
Other values (7) 8
15.4%
Hangul
ValueCountFrequency (%)
7
 
7.1%
7
 
7.1%
5
 
5.1%
4
 
4.1%
4
 
4.1%
4
 
4.1%
3
 
3.1%
3
 
3.1%
3
 
3.1%
3
 
3.1%
Other values (46) 55
56.1%
Punctuation
ValueCountFrequency (%)
6
46.2%
3
23.1%
2
 
15.4%
2
 
15.4%
Latin Ext Additional
ValueCountFrequency (%)
3
60.0%
1
 
20.0%
1
 
20.0%
Diacriticals
ValueCountFrequency (%)
̃ 2
50.0%
́ 2
50.0%
CJK
ValueCountFrequency (%)
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
Other values (3) 3
23.1%

release_year
Real number (ℝ)

Distinct 135
Distinct (%) 0.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 1991.88
Minimum 1874
Maximum 2020
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 354.5 KiB

Quantile statistics

Minimum 1874
5-th percentile 1941
Q1 1978
Median 2001
Q3 2010
95-th percentile 2015
Maximum 2020
Range 146
Interquartile range (IQR) 32

Descriptive statistics

Standard deviation 24.055565
Coefficient of variation (CV) 0.012076814
Kurtosis 0.83912906
Mean 1991.88
Median Absolute Deviation (MAD) 12
Skewness -1.2245988
Sum 90363629
Variance 578.67021
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
2014                1973   4.3%
2015                1905   4.2%
2013                1890   4.2%
2012                1722   3.8%
2011                1667   3.7%
2016                1604   3.5%
2009                1585   3.5%
2010                1501   3.3%
2008                1470   3.2%
2007                1319   2.9%
Other values (125)  28730  63.3%
Minimum 10 values
Value  Count  Frequency (%)
1874   1      < 0.1%
1878   1      < 0.1%
1883   1      < 0.1%
1887   1      < 0.1%
1888   2      < 0.1%
1890   5      < 0.1%
1891   6      < 0.1%
1892   3      < 0.1%
1893   1      < 0.1%
1894   13     < 0.1%
Maximum 10 values
Value  Count  Frequency (%)
2020   1      < 0.1%
2018   5      < 0.1%
2017   531    1.2%
2016   1604   3.5%
2015   1905   4.2%
2014   1973   4.3%
2013   1890   4.2%
2012   1722   3.8%
2011   1667   3.7%
2010   1501   3.3%

return
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct 1256
Distinct (%) 2.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 660.18833
Minimum 0
Maximum 12396383
Zeros 40043
Zeros (%) 88.3%
Negative 0
Negative (%) 0.0%
Memory size 354.5 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
Median 0
Q3 0
95-th percentile 2.54
Maximum 12396383
Range 12396383
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 74701.525
Coefficient of variation (CV) 113.15184
Kurtosis 20668.4
Mean 660.18833
Median Absolute Deviation (MAD) 0
Skewness 138.31428
Sum 29950104
Variance 5.5803179 × 10⁹
Monotonicity Not monotonic
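
The summary figures for return can be cross-checked against each other. The sketch below assumes the 45366 observations stated in the report overview and recomputes the mean, the share of zeros, and the coefficient of variation from the printed sum, zero count, and standard deviation. Variable names are illustrative, and the inputs are rounded as printed.

```python
n_observations = 45366   # "Number of observations" from the report overview
total = 29950104         # Sum
zeros = 40043            # Zeros
std = 74701.525          # Standard deviation

mean = total / n_observations             # should be close to the reported Mean
zeros_pct = 100 * zeros / n_observations  # should round to the reported Zeros (%)
cv = std / mean                           # should be close to the reported CV

print(f"mean={mean:.3f} zeros={zeros_pct:.1f}% cv={cv:.2f}")
```

Because 88.3% of the values are zero, the median and IQR are both 0 while the mean is driven by a handful of extreme values, which is consistent with the very large CV and skewness reported here.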
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
0                    40043  88.3%
0.01                 64     0.1%
0.02                 38     0.1%
1                    34     0.1%
0.08                 29     0.1%
0.03                 27     0.1%
0.06                 27     0.1%
1.1                  26     0.1%
0.62                 25     0.1%
1.2                  23     0.1%
Other values (1246)  5030   11.1%
Minimum 10 values
Value  Count  Frequency (%)
0      40043  88.3%
0.01   64     0.1%
0.02   38     0.1%
0.03   27     0.1%
0.04   19     < 0.1%
0.05   22     < 0.1%
0.06   27     0.1%
0.07   18     < 0.1%
0.08   29     0.1%
0.09   16     < 0.1%
Maximum 10 values
Value       Count  Frequency (%)
12396383    1      < 0.1%
8500000     1      < 0.1%
4197476.62  1      < 0.1%
2755584     1      < 0.1%
1018619.28  1      < 0.1%
1000000     1      < 0.1%
26881.72    1      < 0.1%
12890.39    1      < 0.1%
5330.34     1      < 0.1%
4133.33     1      < 0.1%

Interactions

[Figure: 9 × 9 grid of pairwise interaction scatter plots for the numeric variables]

Correlations

variable | budget | id | popularity | revenue | runtime | vote_average | vote_count | release_year | return | original_language | status
budget | 1.000 | -0.255 | 0.463 | 0.645 | 0.227 | 0.072 | 0.484 | 0.141 | 0.771 | 0.000 | 0.000
id | -0.255 | 1.000 | -0.410 | -0.278 | -0.206 | -0.149 | -0.433 | 0.392 | -0.263 | 0.071 | 0.056
popularity | 0.463 | -0.410 | 1.000 | 0.491 | 0.307 | 0.241 | 0.893 | 0.185 | 0.446 | 0.000 | 0.000
revenue | 0.645 | -0.278 | 0.491 | 1.000 | 0.254 | 0.127 | 0.513 | 0.104 | 0.849 | 0.000 | 0.000
runtime | 0.227 | -0.206 | 0.307 | 0.254 | 1.000 | 0.193 | 0.290 | 0.034 | 0.234 | 0.111 | 0.000
vote_average | 0.072 | -0.149 | 0.241 | 0.127 | 0.193 | 1.000 | 0.318 | -0.008 | 0.121 | 0.070 | 0.019
vote_count | 0.484 | -0.433 | 0.893 | 0.513 | 0.290 | 0.318 | 1.000 | 0.197 | 0.473 | 0.000 | 0.000
release_year | 0.141 | 0.392 | 0.185 | 0.104 | 0.034 | -0.008 | 0.197 | 1.000 | 0.085 | 0.144 | 0.028
return | 0.771 | -0.263 | 0.446 | 0.849 | 0.234 | 0.121 | 0.473 | 0.085 | 1.000 | 0.000 | 0.000
original_language | 0.000 | 0.071 | 0.000 | 0.000 | 0.111 | 0.070 | 0.000 | 0.144 | 0.000 | 1.000 | 0.000
status | 0.000 | 0.056 | 0.000 | 0.000 | 0.000 | 0.019 | 0.000 | 0.028 | 0.000 | 0.000 | 1.000
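
The report does not say which coefficient produced the numeric entries of this matrix. As a rough illustration only, the sketch below assumes a rank (Spearman) correlation and applies it to the budget and revenue values of the ten first sample rows listed in the Sample section. The helper names `average_ranks`, `pearson`, and `spearman` are hypothetical, and a result from ten rows should not be expected to match the full-dataset value in the table.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Budget and revenue of the ten first sample rows (index 0 to 9).
budget = [30000000, 65000000, 0, 16000000, 0,
          60000000, 58000000, 0, 35000000, 58000000]
revenue = [373554033.0, 262797249.0, 0.0, 81452156.0, 76578911.0,
           187436818.0, 0.0, 0.0, 64350171.0, 352194034.0]

rho = spearman(budget, revenue)
print(f"Spearman correlation on the sample rows: {rho:.3f}")
```

Ranking first makes the coefficient insensitive to the heavy skew of monetary columns, which is one reason a rank-based measure is a plausible choice for fields such as budget and revenue.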

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
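
To make the two quantities behind these plots concrete, the following sketch counts missing cells per column and computes a nullity correlation as the Pearson correlation between 0/1 "is missing" indicators. The six records are hypothetical toy data (only the column names come from this report), and `pearson` is an illustrative helper, not the report's own implementation.

```python
from math import sqrt

# Hypothetical toy records; None marks a missing cell.
records = [
    {"tagline": None, "production_companies": "Pixar Animation Studios",
     "production_countries": "United States of America"},
    {"tagline": "Roll the dice and unleash the excitement!",
     "production_companies": "TriStar Pictures",
     "production_countries": "United States of America"},
    {"tagline": None, "production_companies": None,
     "production_countries": None},
    {"tagline": None, "production_companies": None,
     "production_countries": "United Kingdom"},
    {"tagline": "A deadly game of wits.",
     "production_companies": "American World Pictures",
     "production_countries": "United States of America"},
    {"tagline": None, "production_companies": None,
     "production_countries": None},
]
columns = ["tagline", "production_companies", "production_countries"]

# 1 where the cell is missing, 0 where it is present.
missing = {c: [1 if r[c] is None else 0 for r in records] for c in columns}

# Nullity by column: the number of missing cells in each column.
null_counts = {c: sum(flags) for c, flags in missing.items()}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Nullity correlation: how strongly missingness in one column tracks another.
nullity_corr = pearson(missing["production_companies"],
                       missing["production_countries"])

print(null_counts)
print(f"{nullity_corr:.3f}")
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present exactly when the other is absent, and a value near 0 means missingness in one says little about the other.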

Sample

(index) | belongs_to_collection | budget | genres | id | original_language | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count | cast | crew | release_year | return
0 | Toy Story Collection | 30000000 | Animation, Comedy, Family | 862 | en | Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences. | 21.95 | Pixar Animation Studios | United States of America | 1995-10-30 | 373554033.0 | 81.0 | English | Released | NaN | Toy Story | 7.7 | 5415 | Tom Hanks, Tim Allen, Don Rickles, Jim Varney, Wallace Shawn, John Ratzenberger, Annie Potts, John Morris, Erik von Detten, Laurie Metcalf, R. Lee Ermey, Sarah Freeman, Penn Jillette | Director: John Lasseter, Screenplay: Alec Sokolow, Producer: Ralph Guggenheim, Executive Producer: Steve Jobs, Editor: Robert Gordon, Art Direction: Ralph Eggleston, Foley Editor: Mary Helen Leasman, Animation: Ken Willard, ADR Editor: Marilyn McCoppen, Orchestrator: Don Davis, Color Timer: Dale E. Grahn, CG Painter: William Cone, Original Story: Andrew Stanton, Post Production Supervisor: Patsy Bouge, Sculptor: Shelley Daniels Lekven, Animation Director: Rich Quade, Music: Randy Newman, Layout: Desirée Mourad, Music Editor: James Flamberg, Negative Cutter: Rick Mackay, Title Designer: Susan Bradley, Supervising Technical Director: William Reeves, Songs: Randy Newman, Supervising Animator: Pete Docter, Sound Designer: Gary Rydstrom, Production Supervisor: Karen Robert Jackson, Executive Music Producer: Chris Montan, Visual Effects Supervisor: Thomas Porter, Visual Effects: Brian M. Rosen, Lighting Supervisor: Galyn Susman, Character Designer: Jean Gillmore, Set Dresser: Ann M. Rockwell, Editorial Manager: Julie M. McDonald, Assistant Editor: Dana Mulligan, Editorial Coordinator: Deirdre Morrison, Production Coordinator: Ellen Devine, Unit Publicist: Lauren Beth Strogoff, Sound Re-Recording Mixer: Gary Summers, Supervising Sound Editor: Tim Holland, Sound Effects Editor: Pat Jackson, Sound Design Assistant: Tom Myers, Assistant Sound Editor: Dan Engstrom, Casting Consultant: Ruth Lambert, ADR Voice Casting: Mickie McGowan | 1995 | 12.45
1 | NaN | 65000000 | Adventure, Fantasy, Family | 8844 | en | When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures. | 17.02 | TriStar Pictures, Teitler Film, Interscope Communications | United States of America | 1995-12-15 | 262797249.0 | 104.0 | English, Français | Released | Roll the dice and unleash the excitement! | Jumanji | 6.9 | 2413 | Robin Williams, Jonathan Hyde, Kirsten Dunst, Bradley Pierce, Bonnie Hunt, Bebe Neuwirth, David Alan Grier, Patricia Clarkson, Adam Hann-Byrd, Laura Bell Bundy, James Handy, Gillian Barber, Brandon Obray, Cyrus Thiedeke, Gary Joseph Thorup, Leonard Zola, Lloyd Berry, Malcolm Stewart, Annabel Kershaw, Darryl Henriques, Robyn Driscoll, Peter Bryant, Sarah Gilson, Florica Vlad, June Lion, Brenda Lockmuller | Executive Producer: Robert W. Cort, Screenplay: Jim Strain, Original Music Composer: James Horner, Director: Joe Johnston, Editor: Robert Dalva, Casting: Nancy Foy, Animation Supervisor: Kyle Balda, Production Design: James D. Bissell, Producer: William Teitler, Director of Photography: Thomas E. Ackerman, Novel: Chris van Allsburg | 1995 | 4.04
2 | Grumpy Old Men Collection | 0 | Romance, Comedy | 15602 | en | A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max. | 11.71 | Warner Bros., Lancaster Gate | United States of America | 1995-12-22 | 0.0 | 101.0 | English | Released | Still Yelling. Still Fighting. Still Ready for Love. | Grumpier Old Men | 6.5 | 92 | Walter Matthau, Jack Lemmon, Ann-Margret, Sophia Loren, Daryl Hannah, Burgess Meredith, Kevin Pollak | Director: Howard Deutch, Characters: Mark Steven Johnson, Writer: Mark Steven Johnson, Sound Recordist: Jack Keller | 1995 | 0.00
3 | NaN | 16000000 | Comedy, Drama, Romance | 31357 | en | Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive "good man" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe. | 3.86 | Twentieth Century Fox Film Corporation | United States of America | 1995-12-22 | 81452156.0 | 127.0 | English | Released | Friends are the people who let you be yourself... and never let you forget it. | Waiting to Exhale | 6.1 | 34 | Whitney Houston, Angela Bassett, Loretta Devine, Lela Rochon, Gregory Hines, Dennis Haysbert, Michael Beach, Mykelti Williamson, Lamont Johnson, Wesley Snipes | Director: Forest Whitaker, Screenplay: Terry McMillan, Producer: Caron K, Executive Producer: Terry McMillan, Novel: Terry McMillan, Original Music Composer: Kenneth Edmonds | 1995 | 5.09
4 | Father of the Bride Collection | 0 | Comedy | 11862 | en | Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own. | 8.39 | Sandollar Productions, Touchstone Pictures | United States of America | 1995-02-10 | 76578911.0 | 106.0 | English | Released | Just When His World Is Back To Normal... He's In For The Surprise Of His Life! | Father of the Bride Part II | 5.7 | 173 | Steve Martin, Diane Keaton, Martin Short, Kimberly Williams-Paisley, George Newbern, Kieran Culkin, BD Wong, Peter Michael Goetz, Kate McGregor-Stewart, Jane Adams, Eugene Levy, Lori Alan | Original Music Composer: Alan Silvestri, Director of Photography: Elliot Davis, Screenplay: Albert Hackett, Producer: Nancy Meyers, Director: Charles Shyer, Editor: Adam Bernardi | 1995 | 0.00
5 | NaN | 60000000 | Action, Crime, Drama, Thriller | 949 | en | Obsessive master thief, Neil McCauley leads a top-notch crew on various insane heists throughout Los Angeles while a mentally unstable detective, Vincent Hanna pursues him without rest. Each man recognizes and respects the ability and the dedication of the other even though they are aware their cat-and-mouse game may end in violence. | 17.92 | Regency Enterprises, Forward Pass, Warner Bros. | United States of America | 1995-12-15 | 187436818.0 | 170.0 | English, Español | Released | A Los Angeles Crime Saga | Heat | 7.7 | 1886 | Al Pacino, Robert De Niro, Val Kilmer, Jon Voight, Tom Sizemore, Diane Venora, Amy Brenneman, Ashley Judd, Mykelti Williamson, Natalie Portman, Ted Levine, Tom Noonan, Tone Loc, Hank Azaria, Wes Studi, Dennis Haysbert, Danny Trejo, Henry Rollins, William Fichtner, Kevin Gage, Susan Traylor, Jerry Trimble, Ricky Harris, Jeremy Piven, Xander Berkeley, Begonya Plaza, Rick Avery, Hazelle Goodman, Ray Buktenica, Max Daniels, Vince Deadrick Jr., Steven Ford, Farrah Forke, Patricia Healy, Paul Herman, Cindy Katz, Brian Libby, Dan Martin, Mario Roberts, Thomas Rosales, Jr., Yvonne Zima, Mick Gould, Bud Cort, Viviane Vives, Kim Staunton, Martin Ferrero, Brad Baldridge, Andrew Camuccio, Kenny Endoso, Kimberly Flynn, Niki Harris, Bill McIntosh, Rick Marzan, Terry Miller, Daniel O'Haco, Kai Soremekun, Peter Blackwell, Trevor Coppola, Mary Kircher, Darin Mangan, Robert Miranda, Manny Perry, Iva Franks Singer, Tim Werner, Philip Ettington | Director: Michael Mann, Screenplay: Michael Mann, Producer: Michael Mann, Original Music Composer: Elliot Goldenthal, Director of Photography: Dante Spinotti, Editor: Tom Rolf, Casting: Jane Brody, Production Design: Neil Spisak, Art Direction: Margie Stone McShirley, Costume Design: Deborah Lynn Scott, Music Editor: Michael Connell, Supervising Sound Editor: Larry Kemp, Special Effects Coordinator: Terry D. Frazee, Special Effects: Donald Frazee, Visual Effects Supervisor: Neil Krepela, Stunt Coordinator: Joel Kramer, Stunts: Doug Coleman, Set Decoration: Anne H. Ahrens, Costume Supervisor: Darryl M. Athons, Script Supervisor: Cate Hardman, Art Department Coordinator: Oscar Mazzola, Assistant Art Director: Dianne Wager, Construction Coordinator: Anthony Lattanzio, Assistant Costume Designer: David Le Vey, Hairstylist: Ilona Herman, Key Hair Stylist: Vera Mitchell, Makeup Artist: Ken Diaz, Dialogue Editor: Lauren Stephens, Camera Operator: Gary Jay, Steadicam Operator: James Muro, Still Photographer: Frank Connor, First Assistant Camera: Chris Moseley, Rigging Gaffer: Frank Dorowsky, Music Supervisor: Budd Carr, First Assistant Editor: Ray Boniker, Sound Re-Recording Mixer: Mark Smith, Technical Supervisor: Mick Gould, Executive Producer: Arnon Milchan, Associate Producer: Gusmano Cesaretti, Unit Production Manager: Christopher Cronyn, Assistant Director: Michael Waxman, Casting Associate: Alison E. McBryde, Set Costumer: Marsha Bozeman, Digital Effects Supervisor: Jeff Wells, Sound Recordist: Philip Rogers, Additional Soundtrack: Jimmy Webb | 1995 | 3.12
6 | NaN | 58000000 | Comedy, Romance | 11860 | en | An ugly duckling having undergone a remarkable change, still harbors feelings for her crush: a carefree playboy, but not before his business-focused brother has something to say about it. | 6.68 | Paramount Pictures, Scott Rudin Productions, Mirage Enterprises, Sandollar Productions, Constellation Entertainment, Worldwide, Mont Blanc Entertainment GmbH | Germany, United States of America | 1995-12-15 | 0.0 | 127.0 | Français, English | Released | You are cordially invited to the most surprising merger of the year. | Sabrina | 6.2 | 141 | Harrison Ford, Julia Ormond, Greg Kinnear, Angie Dickinson, Nancy Marchand, John Wood, Richard Crenna, Lauren Holly, Dana Ivey, Fanny Ardant, Patrick Bruel, Paul Giamatti, Miriam Colón, Elizabeth Franz, Valérie Lemercier, Becky Ann Baker, John C. Vennema, Margo Martindale, J. Smith-Cameron, Christine Luneau-Lipton, Michael Dees, Denis Holmes, Jo-Jo Lowe, Ira Wheeler, Philippa Cooper, Ayako Kawahara, François Genty, Guillaume Gallienne, Inés Sastre, Phina Oruche, Andrea Behalikova, Jennifer Herrera, Kristina Kumlin, Eva Linderholm, Carmen Chaplin, Micheline Van de Velde, Joanna Rhodes, Alan Boone, Patrick Forster-Delmas, Kentaro Matsuo, Peter McKernan, Ed Connelly, Ronald L. Schwary, Alvin Lum, Siching Song, Phil Nee, Randy Becker, Susan Browning, Anthony Mondal, Peter Parks, Woodrow Asai, Eric Bruno Borgman, Michael Cline, Christopher Del Gaudio, Philippe Hartmann, Jerry Quinn, Dori Rosenthal | Director: Sydney Pollack, Screenplay: David Rayfiel, Producer: Scott Rudin, Original Music Composer: John Williams, Editor: Fredric Steinkamp, Casting: David Rubin, Production Design: Brian Morris, Makeup Artist: Joseph A. Campayno, Hairstylist: Stephen G. Bishop, Co-Costume Designer: Gary Jones, Costume Design: Ann Roth, Set Decoration: Amy Marshall, Art Department Coordinator: Miriam Schapiro, Sound mixer: Danny Michael, Sound Re-Recording Mixer: Scott Millan, Supervising Sound Effects Editor: Myron Nettinga, Sound Effects Editor: Joe Earle, Supervising Sound Editor: J. Paul Huntsman, Boom Operator: Andrew Schmetterling, Dialogue Editor: Benjamin Beardwood, Script Supervisor: Mary A. Kelly, Still Photographer: Brian Hamill, Camera Operator: Giovanni Fiore Coltellacci, Director of Photography: Giuseppe Rotunno, Casting Associate: Ronna Kress, Assistant Costume Designer: Michelle Matland, Costume Supervisor: Donna Maloney, First Assistant Editor: Karl F. Steinkamp, Executive Producer: Ronald L. Schwary, Art Direction: John Kasarda, Production Manager: Ronald L. Schwary, Production Supervisor: Thomas A. Imperato, Casting Assistant: Bill Kaufman, Location Manager: Joseph E. Iberti, Production Coordinator: Katherine Kennedy | 1995 | 0.00
7 | NaN | 0 | Action, Adventure, Drama, Family | 45325 | en | A mischievous young boy, Tom Sawyer, witnesses a murder by the deadly Injun Joe. Tom becomes friends with Huckleberry Finn, a boy with no future and no family. Tom has to choose between honoring a friendship or honoring an oath because the town alcoholic is accused of the murder. Tom and Huck go through several adventures trying to retrieve evidence. | 2.56 | Walt Disney Pictures | United States of America | 1995-12-22 | 0.0 | 97.0 | English, Deutsch | Released | The Original Bad Boys. | Tom and Huck | 5.4 | 45 | Jonathan Taylor Thomas, Brad Renfro, Rachael Leigh Cook, Michael McShane, Amy Wright, Eric Schweig, Tamara Mello | Screenplay: Stephen Sommers, Director: Peter Hewitt, Novel: Mark Twain | 1995 | 0.00
8 | NaN | 35000000 | Action, Adventure, Thriller | 9091 | en | International action superstar Jean Claude Van Damme teams with Powers Boothe in a Tension-packed, suspense thriller, set against the back-drop of a Stanley Cup game.Van Damme portrays a father whose daughter is suddenly taken during a championship hockey game. With the captors demanding a billion dollars by game's end, Van Damme frantically sets a plan in motion to rescue his daughter and abort an impending explosion before the final buzzer... | 5.23 | Universal Pictures, Imperial Entertainment, Signature Entertainment | United States of America | 1995-12-22 | 64350171.0 | 106.0 | English | Released | Terror goes into overtime. | Sudden Death | 5.5 | 174 | Jean-Claude Van Damme, Powers Boothe, Dorian Harewood, Raymond J. Barry, Ross Malinger, Whittni Wright | Director: Peter Hyams, Screenplay: Gene Quintano, Producer: Howard Baldwin, Music: John Debney, Director of Photography: Peter Hyams, Editor: Steven Kemper | 1995 | 1.84
9 | James Bond Collection | 58000000 | Adventure, Action, Thriller | 710 | en | James Bond must unmask the mysterious head of the Janus Syndicate and prevent the leader from utilizing the GoldenEye weapons system to inflict devastating revenge on Britain. | 14.69 | United Artists, Eon Productions | United Kingdom, United States of America | 1995-11-16 | 352194034.0 | 130.0 | English, Pусский, Español | Released | No limits. No fears. No substitutes. | GoldenEye | 6.6 | 1194 | Pierce Brosnan, Sean Bean, Izabella Scorupco, Famke Janssen, Joe Don Baker, Judi Dench, Gottfried John, Robbie Coltrane, Alan Cumming, Tchéky Karyo, Desmond Llewelyn, Samantha Bond, Michael Kitchen, Serena Gordon, Simon Kunz, Billy J. Mitchell, Constantine Gregory, Minnie Driver, Michelle Arthur, Ravil Isyanov | Director: Martin Campbell, Characters: Ian Fleming, Screenplay: Bruce Feirstein, Producer: Anthony Waye, Executive Producer: Tom Pevsner, Original Music Composer: Eric Serra, Songs: Tina Turner, Director of Photography: Phil Meheux, Editor: Terry Rawlings, Casting: Pam Dixon, Production Design: Peter Lamont, Art Direction: Charles Dwight Lee, Set Decoration: Michael Ford, Costume Design: Lindy Hemming, Story: Michael France, Assistant Art Director: Steven Lawrence, Construction Coordinator: Tony Graysmark, Supervising Art Director: Neil Lamont, Music Editor: Robert Hathaway, Armorer: Charles Bodycomb, Script Supervisor: June Randall, Camera Operator: Tim Wooster, Still Photographer: George Whitear, Gaffer: Steve Foster, Special Effects Supervisor: Chris Corbould, Visual Effects Coordinator: Mara Bryan, Visual Effects Editor: Tim Grover, Dialogue Editor: Peter Musgrave, Sound Re-Recording Mixer: John Hayward, Supervising Sound Editor: Jim Shields, Sound Recordist: David John | 1995 | 6.07
(index) | belongs_to_collection | budget | genres | id | original_language | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count | cast | crew | release_year | return
45356 | NaN | 0 | NaN | 67179 | it | Sentenced to life imprisonment for illegal activities, Italian International member Giulio Manieri holds on to his political ideals while struggling against madness in the loneliness of his prison cell. | 0.23 | NaN | NaN | 1972-01-01 | 0.0 | 90.0 | Italiano | Released | NaN | St. Michael Had a Rooster | 6.0 | 3 | Giulio Brogi, Renato Cestiè, Vito Cipolla, Daniele Dublino | Novel: Leo Tolstoy, Screenplay: Paolo Taviani, Director: Vittorio Taviani | 1972 | 0.0
45357 | NaN | 0 | Horror, Mystery, Thriller | 84419 | en | An unsuccessful sculptor saves a madman named "The Creeper" from drowning. Seeing an opportunity for revenge, he tricks the psycho into murdering his critics. | 0.22 | Universal Pictures | United States of America | 1946-03-29 | 0.0 | 65.0 | English | Released | Meet...The CREEPER! | House of Horrors | 6.3 | 8 | Rondo Hatton, Robert Lowery, Virginia Grey, Bill Goodwin, Martin Kosleck, Alan Napier, Howard Freeman, Virginia Christine, Joan Shawlee, Byron Foulger, Syd Saylor | Set Decoration: Ralph Warrington, Art Direction: Abraham Grossman, Makeup Artist: Jack P. Pierce, Editor: Philip Cahn, Director: Jean Yarbrough, Screenplay: George Bricker, Director of Photography: Maury Gertsman, Original Story: Dwight V. Babcock, Producer: Ben Pivar | 1946 | 0.0
45358 | NaN | 0 | Mystery, Horror | 390959 | en | In this true-crime documentary, we delve into the murder spree that was the inspiration for Joe Berlinger's "Book of Shadows: Blair Witch 2". | 0.08 | NaN | NaN | 2000-10-22 | 0.0 | 45.0 | English | Released | NaN | Shadow of the Blair Witch | 7.0 | 2 | Tony Abatemarco, Andre Brooks, Mariclare Costello, Bill Dreggors, Apollo Dukakis, Philip Friedman, James Gleason, Dilva Henry, Bari Hochwald, Wendy Hoffman, John Huck, Rachel Moskowitz, Sandy Mulvihill, Roger Nolan, Chris Parnell, Byrne Piven, Richard Sexton, Rich Williams, Ray Xifo | Director: Ben Rock, Writer: Ben Rock, Producer: Ben Rock, Executive Producer: Pirie Jones, Line Producer: Kimberly Rach, Original Music Composer: Sasha Bogdanowitsch, Cinematography: Neal Fredericks, Editor: George Rizkallah, Casting: David Giella, Production Design: Steven P. Duchscherer, Art Direction: Chris Davis, Makeup Department Head: Kimberly Eckhout, Makeup Artist: Hillary Wallace, Hairstylist: Hillary Wallace, Hair Department Head: Kimberly Eckhout, Assistant Director: Aaron Walters, Art Department Coordinator: Shaun Richkind, Sound Designer: Jeremy M. Gilleece, Sound Mixer: Jeremy M. Gilleece, Boom Operator: Jackson Hilliard, Still Photographer: James Grossman, Gaffer: Dale Obert, Costume Design: Ann Roth | 2000 | 0.0
45359 | NaN | 0 | Horror | 289923 | en | A film archivist revisits the story of Rustin Parr, a hermit thought to have murdered seven children while under the possession of the Blair Witch. | 0.39 | Neptune Salad Entertainment, Pirie Productions | United States of America | 2000-10-03 | 0.0 | 30.0 | English | Released | Do you know what happened 50 years before "The Blair Witch Project"? | The Burkittsville 7 | 7.0 | 1 | Monty Bane, Lucy Butler, David Grammer, Bill Dreggors, Frank Pastor, Heather Donahue, Joshua Leonard, Michael C. Williams | Director: Ben Rock, Writer: Ben Rock | 2000 | 0.0
45360 | NaN | 0 | Science Fiction | 222848 | en | It's the year 3000 AD. The world's most dangerous women are banished to a remote asteroid 45 million light years from earth. Kira Murphy doesn't belong; wrongfully accused of a crime she did not commit, she's thrown in this interplanetary prison and left to her own defenses. But Kira's a fighter, and soon she finds herself in the middle of a female gang war; where everyone wants a piece of the action... and a piece of her! "Caged Heat 3000" takes the Women-in-Prison genre to a whole new level... and a whole new galaxy! | 0.66 | Concorde-New Horizons | United States of America | 1995-01-01 | 0.0 | 85.0 | English | Released | NaN | Caged Heat 3000 | 3.5 | 1 | Lisa Boyle, Kena Land, Zaneta Polard, Don Yanan, Debra K. Beatty, Mark Sikes, Robert J. Ferrelli, Ellyn Dawn Humphreys, Ron Jeremy, Ben Ramsey | Executive Producer: Mike Elliott, Director: Aaron Osborne, Producer: Mike Upton, Writer: Emile Dupont, Editor: Felix Chamberlain | 1995 | 0.0
45361 | NaN | 0 | Drama, Action, Romance | 30840 | en | Yet another version of the classic epic, with enough variation to make it interesting. The story is the same, but some of the characters are quite different from the usual, in particular Uma Thurman's very special maid Marian. The photography is also great, giving the story a somewhat darker tone. | 5.68 | Westdeutscher Rundfunk (WDR), Working Title Films, 20th Century Fox Television, CanWest Global Communications | Canada, Germany, United Kingdom, United States of America | 1991-05-13 | 0.0 | 104.0 | English | Released | NaN | Robin Hood | 5.7 | 26 | Patrick Bergin, Uma Thurman, David Morrissey, Jürgen Prochnow, Jeroen Krabbé | Director: John Irvin, Writer: John McGrath, Story: Sam Resnick, Producer: Sarah Radclyffe, Music: Geoffrey Burgon, Director of Photography: Jason Lehel, Editor: Peter Tanner, Casting: Susie Figgis | 1991 | 0.0
45362 | NaN | 0 | Drama | 111109 | tl | An artist struggles to finish his work while a storyline about a cult plays in his head. | 0.18 | Sine Olivia | Philippines | 2011-11-17 | 0.0 | 360.0 | NaN | Released | NaN | Century of Birthing | 9.0 | 3 | Angel Aquino, Perry Dizon, Hazel Orencio, Joel Torre, Bart Guingona, Soliman Cruz , Roeder, Angeli Bayani, Dante Perez, Betty Uy-Regala, Modesta | Director: Lav Diaz, Writer: Lav Diaz, Production Design: Dante Perez, Music: Lav Diaz, Editor: Lav Diaz, Cinematography: Lav Diaz | 2011 | 0.0
45363 | NaN | 0 | Action, Drama, Thriller | 67758 | en | When one of her hits goes wrong, a professional assassin ends up with a suitcase full of a million dollars belonging to a mob boss ... | 0.90 | American World Pictures | United States of America | 2003-08-01 | 0.0 | 90.0 | English | Released | A deadly game of wits. | Betrayal | 3.8 | 6 | Erika Eleniak, Adam Baldwin, Julie du Page, James Remar, Damian Chapa, Louis Mandylor, Tom Wright, Jeremy Lelliott, James Quattrochi, Jason Widener, Joe Sabatino, Kiko Ellsworth, Don Swayze, Peter Dobson, Darrell Dubovsky | Director: Mark L. Lester, Screenplay: Jeffrey Goldenberg, Original Music Composer: Richard McHugh, Director of Photography: João Fernandes | 2003 | 0.0
45364 | NaN | 0 | NaN | 227506 | en | In a small town live two brothers, one a minister and the other one a hunchback painter of the chapel who lives with his wife. One dreadful and stormy night, a stranger knocks at the door asking for shelter. The stranger talks about all the good things of the earthly life the minister is missing because of his puritanical faith. The minister comes to accept the stranger's viewpoint but it is others who will pay the consequences because the minister will discover the human pleasures thanks to, ehem, his sister- in -law… The tormented minister and his cuckolded brother will die in a strange accident in the chapel and later an infant will be born from the minister's adulterous relationship. | 0.00 | Yermoliev | Russia | 1917-10-21 | 0.0 | 87.0 | NaN | Released | NaN | Satan Triumphant | 0.0 | 0 | Iwan Mosschuchin, Nathalie Lissenko, Pavel Pavlov, Aleksandr Chabrov, Vera Orlova | Director: Yakov Protazanov, Producer: Joseph N. Ermolieff | 1917 | 0.0
45365 | NaN | 0 | NaN | 461257 | en | 50 years after decriminalisation of homosexuality in the UK, director Daisy Asquith mines the jewels of the BFI archive to take us into the relationships, desires, fears and expressions of gay men and women in the 20th century. | 0.16 | NaN | United Kingdom | 2017-06-09 | 0.0 | 75.0 | English | Released | NaN | Queerama | 0.0 | 0 | NaN | Director: Daisy Asquith | 2017 | 0.0
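
The report does not document how the `return` column was derived. In the sample rows its values are consistent with revenue divided by budget, with 0 reported wherever the budget is 0. The sketch below encodes that reading as an assumption; `compute_return` is a hypothetical helper, and the three inputs are the revenue and budget figures shown for Toy Story, Jumanji, and Father of the Bride Part II.

```python
def compute_return(revenue, budget):
    """Hypothetical helper: revenue-to-budget ratio, 0.0 when budget is 0.

    This mirrors what the sample rows suggest; it is an assumption,
    not a documented definition of the `return` column.
    """
    if budget == 0:
        return 0.0
    return revenue / budget


# (title, revenue, budget) taken from the sample rows of this report.
samples = [
    ("Toy Story", 373554033.0, 30000000),
    ("Jumanji", 262797249.0, 65000000),
    ("Father of the Bride Part II", 76578911.0, 0),
]

for title, revenue, budget in samples:
    print(title, round(compute_return(revenue, budget), 2))
```

Under this reading, a `return` of 0 can mean either a missing budget or missing revenue, so the column mixes "unknown" with "no return". That is worth keeping in mind when interpreting its strong correlations with budget and revenue.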